Simplifying AI integration on Apache Spark
Spark is an ETL and data processing engine especially suited to big data. In most organizations, different teams work with different languages, frameworks and libraries, and their work needs to be integrated into ETL pipelines or general data processing. For example, a Spark ETL job may be written in Scala by the data engineering team, while the machine learning solution it must integrate is written in Python or R by the data science team. Such solutions are not straightforward to integrate with the Spark engine, and doing so requires a great deal of collaboration between teams, which increases overall project time and cost.

Furthermore, these solutions keep changing: they are upgraded to the latest versions of the underlying technologies and to improved designs and implementations. This is especially true in the machine learning domain, where models and algorithms keep improving with new data and new approaches. As a result, integrating each upgraded version involves significant downtime.

In this talk we will discuss how Informatica integrates AI solutions into data processing pipelines executing on top of Spark, along with the following major features:

1. The data science team can easily share AI/ML solutions created using any library, language or framework.
2. A shared AI/ML solution can be easily consumed in the Spark pipeline.
3. Using Informatica products, customers can create the Spark pipeline with the selected solution(s) by drag and drop.
4. Various teams can continuously integrate and deploy (CI/CD) different solutions with minimal downtime.

In conclusion, we will understand how different teams (such as data scientists and data engineers) can integrate their work, thereby reducing the time and cost spent on collaboration. We will also understand how CI/CD is achieved on Spark with minimal downtime while integrating various projects, especially AI/ML projects, using Informatica products.
Thus, by using features such as drag-and-drop creation of Spark pipelines, easy collaboration between teams with minimal effort, and CI/CD, organizations can drastically reduce overall project completion time and cost.

About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: https://databricks.com/product/unifie...

Connect with us:
Website: https://databricks.com
Facebook: databricksinc
Twitter: databricks
LinkedIn: databricks
Instagram: databricksinc

Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: https://databricks.com/databricks-nam...