Online event

Building Real-Time RAG for Financial Data and News

Name: Building Real-Time RAG for Financial Data and News
Start: 2024-06-04
End: 2024-06-04
Location: Virtual

We're teaming up with Microsoft and Unstructured to bring you an incredible workshop, hosted by AICamp.

In this workshop, we will focus on designing RAG pipelines as dataflows, including Directed Graphs (DG), and on integrating real-time analytics into them.

Discover the knowledge and skills needed to set up and manage real-time Retrieval Augmented Generation (RAG) pipelines using both structured and unstructured financial data.

One of the challenges of financial data is how quickly it becomes irrelevant, both the market prices themselves and the news about events surrounding them.
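To make the staleness point concrete, here is a minimal Python sketch of a freshness filter that could run before retrieved items are handed to a model. The `NewsItem` type, the `fresh_items` helper, the 15-minute window, and the sample headlines are all illustrative assumptions, not part of the workshop materials.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class NewsItem:
    """Hypothetical record for a headline tied to a ticker."""
    ticker: str
    headline: str
    published_at: datetime


def fresh_items(items, now, max_age=timedelta(minutes=15)):
    """Keep only items published within max_age of now, newest first."""
    recent = [item for item in items if now - item.published_at <= max_age]
    return sorted(recent, key=lambda item: item.published_at, reverse=True)


# Made-up sample data for a placeholder ticker.
now = datetime(2024, 6, 4, 17, 0, tzinfo=timezone.utc)
items = [
    NewsItem("XYZ", "Earnings call scheduled", now - timedelta(hours=3)),
    NewsItem("XYZ", "Price target raised", now - timedelta(minutes=5)),
    NewsItem("XYZ", "New product announced", now - timedelta(minutes=12)),
]

for item in fresh_items(items, now):
    print(item.headline)
```

The right window depends on the data: tick-level prices may go stale in seconds, while a regulatory filing can stay relevant for weeks, so a real pipeline would likely use a different `max_age` per source.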

The event is 🆓 free and expected to last ⏰ 2 hours.

- Leverage Bytewax for integrating real-time analytics into your data processing workflows.
- Incorporate Unstructured to process image- and web-based information.
- Incorporate Azure AI services to deploy and manage RAG pipelines.
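
As a rough illustration of how such stages chain together, the following Python sketch wires chunking, embedding, and indexing into a small linear flow using plain generators. It deliberately does not use the Bytewax, Unstructured, or Azure APIs: every function name is hypothetical, the "embedding" is a toy stand-in, and the index is an in-memory dict.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def chunk(docs: Iterable[Tuple[str, str]], size: int = 5) -> Iterator[Tuple[str, str]]:
    """Split each (doc_id, text) pair into chunks of at most `size` words."""
    for doc_id, text in docs:
        words = text.split()
        for start in range(0, len(words), size):
            yield f"{doc_id}#{start // size}", " ".join(words[start:start + size])


def embed(chunks: Iterable[Tuple[str, str]]) -> Iterator[Tuple[str, str, List[float]]]:
    """Toy stand-in for an embedding model: [word count, character count]."""
    for chunk_id, text in chunks:
        yield chunk_id, text, [float(len(text.split())), float(len(text))]


def index(embedded: Iterable[Tuple[str, str, List[float]]]) -> Dict[str, dict]:
    """Collect embedded chunks into a dict standing in for a search index."""
    store: Dict[str, dict] = {}
    for chunk_id, text, vector in embedded:
        store[chunk_id] = {"text": text, "vector": vector}
    return store


# Made-up sample document; each stage consumes the previous stage's output.
docs = [("news-1", "Shares rose after the company reported quarterly results above expectations")]
store = index(embed(chunk(docs)))

for chunk_id in sorted(store):
    print(chunk_id, "->", store[chunk_id]["text"])
```

Because each stage only consumes the output of the one before it, a stage can be swapped for a real service or parallelized without touching the others, which is the main appeal of describing a pipeline as a graph of steps.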

Below is a diagram showing how all the components work together.

[Diagram: graph MSFT workshop.png]

❗️ You can find more details in our blog.

Guiding you through this experience will be

Zander Matheson

CEO, Founder at Bytewax

Zander is a seasoned data engineer who founded and currently leads Bytewax. He has worked in the data space since 2014, at Heroku, GitHub, and an NLP startup. Before that, he attended business school at UT Austin and HEC Paris.

Laura Funderburk

Senior Developer Advocate at Bytewax

Laura Funderburk has a B.Sc. in Mathematics from Simon Fraser University and over three years of experience as a professional data scientist. Laura is enthusiastic about using open source for MLOps and DataOps and is passionate about outreach and education. In her day-to-day work, Laura creates written content about building end-to-end, scalable LLM pipelines with streaming data.

Shagun Sharma

Data Scientist at Microsoft via TCS

Shagun Sharma is a highly skilled Data Scientist and AI Engineer with over five years of experience in Natural Language Processing (NLP). For almost four years, Shagun has been a pivotal part of the AI Co-Innovation Lab, leading multiple customer engagements around generative AI globally. Shagun's leadership has extended to labs in Redmond, WA; Montevideo, Uruguay; and San Francisco, CA. Shagun has built more than 15 proof-of-concept (POC) projects using Azure technologies, including Azure OpenAI, Azure AI Search, and Azure Document Intelligence, along with LangChain and other advanced tools.

Nina Lopatina

Staff Developer Relations Engineer at Unstructured

Nina Lopatina is a Staff Developer Relations Engineer at Unstructured, where she helps customers make the most of their unstructured data for retrieval augmented generation (RAG) and other large language model (LLM) use cases. Nina has worked primarily on multilingual language modeling since 2018, including language classification and generation. Throughout her career, she has focused on the data that LLMs need to improve performance and reliability.

🙋‍♂️🙋‍♀️ Audience

🛠️ Data/ML/AI engineers;

🔬 Data scientists;

💻 Software engineers interested in data processing;

📊 IT professionals looking to understand and apply RAG.

Jun 4, 2024, 5:00 PM • workshop

Workshop prerequisites

Basic Understanding of Data Structures and Algorithms

Familiarity with fundamental concepts in data structures and algorithms is required. This includes knowledge of arrays, linked lists, stacks, queues, trees, and basic algorithmic principles.

Proficiency in Python Programming

Comfortable with writing and debugging Python code. Experience with Python libraries commonly used in data processing and machine learning, such as Pandas, NumPy, and Scikit-learn.

Knowledge of Data Processing and ETL Concepts

Understanding of data extraction, transformation, and loading (ETL) processes. Experience with handling structured (e.g., CSV, JSON) and unstructured (e.g., text, images) data.

[Optional] To reproduce the solution - not required to participate in the webinar

Setup of Required Azure Services
- Azure AI Search: Follow the instructions here to create an Azure AI Search service.
- Azure OpenAI: Set up the Azure OpenAI service and deploy the models 'gpt-4 (0613)' and 'text-embedding-ada-002' by following the instructions here.

Get Unstructured API Key and Install Unstructured Tools
- Obtain an API key from Unstructured by signing up on their platform.
- Follow the API documentation to set up your connectors and ingest your documents into your destination.

Clone this repository and install dependencies
- git clone
- cd real-time-rag-workshop/
- pip install -r requirements.txt
