Dear Bytewax Community,
As we continue our reflection on 2024, we're excited to share the third chapter of our year in review.
Things Built with Bytewax
One of the most exciting aspects of open source is seeing what people create with your tools. This year, the Bytewax community has truly impressed us with their creativity and ingenuity. The "Used By" feature on GitHub gives us a glimpse into the public repositories that integrate Bytewax—and the results are incredible.
Here are some standout projects that made us pause and say, “Wow!”:
🌲 Anomaly Detection for Air Quality: A real-time system that detects anomalies in live air quality data streams, built with Redpanda and Bytewax to provide instant, actionable insights.
📰 Real-Time News Search Engine: This project takes real-time news ingestion to the next level, combining Bytewax, Kafka, Upstash, and LangChain to enable semantic search for breaking news.
💹 Finance Feature Engineering: A 100% Python pipeline that processes live trade data from Coinbase, transforms it into OHLC features with Bytewax, and stores the results in Hopsworks for analysis.
Other notable mentions include Resumify (AI-generated tailored resumes), GutenbergV2 (a repository evaluator), LinkedIn RAG (embeddings for LinkedIn posts), and Spanda.ai (real-time embeddings with Weaviate's Verba for human-centered AI solutions). Bytewax is powering everything from clean transportation to next-gen search engines.
Have you built something cool with Bytewax? Share it with us!
Bytewax Cheatsheets
This year, Bytewax introduced not just one, but three impactful cheatsheets to elevate your data engineering workflows. Here's how each of them helps you master distributed dataflows and real-time stream processing with Bytewax:
1️⃣ Bytewax Cheatsheet — A concise yet powerful guide to help users quickly understand and implement Bytewax in their workflows.
- Dataflows as DAGs: Learn how Bytewax uses Directed Acyclic Graphs to transform raw data into actionable insights.
- Connectors Galore: From Kafka to files, Bytewax makes data ingestion and output seamless.
- Python Power: By leveraging Python’s ecosystem, you can integrate libraries like NumPy or Shapely for advanced transformations.
2️⃣ Bytewax Operator Cheatsheet — For developers working on distributed dataflows, this cheatsheet dives deep into operators, both stateless and stateful.
- Stateless transformations like map and filter for basic data processing.
- Stateful transformations to maintain information across events for tasks like counting or aggregating data.
- Operator Highlights: Includes practical examples for real-world use, from enriching data with external sources to managing complex operations with Bytewax's powerful stateful_batch operator.
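To make the stateless versus stateful distinction concrete, here is a minimal sketch in plain Python. It deliberately does not use the Bytewax API: the sample events and the helper names (`scale`, `is_large`, `running_count`) are hypothetical and exist only to illustrate the idea. A stateless step looks at one event at a time, while a stateful step remembers something per key between events.

```python
from collections import defaultdict

# Hypothetical sample stream of (key, value) events.
events = [("a", 3), ("b", 12), ("a", 15), ("b", 7), ("a", 20)]


def scale(event):
    """Stateless map: transform one event without looking at any other."""
    key, value = event
    return key, value * 2


def is_large(event):
    """Stateless filter: keep or drop one event based only on itself."""
    return event[1] >= 10


def running_count(stream):
    """Stateful step: keep a per-key counter that survives across events."""
    counts = defaultdict(int)
    for key, _value in stream:
        counts[key] += 1
        yield key, counts[key]


# Stateless stage: map, then filter.
stateless_out = [e for e in map(scale, events) if is_large(e)]

# Stateful stage: emit how many events each key has produced so far.
stateful_out = list(running_count(stateless_out))

print(stateless_out)
print(stateful_out)
```

In a real dataflow the per-key state would be managed by the framework, which is what makes recovery and scaling across workers possible; the dictionary here only stands in for that idea.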
3️⃣ Bytewax Windowing Cheatsheet — Windowing is essential for working with continuous data streams. This cheatsheet explains how to divide unbounded streams into manageable "windows" for computation.
- Tumbling Windows: Fixed-size, non-overlapping intervals for real-time analytics.
- Sliding Windows: Overlapping windows for calculating moving averages or trends.
- Session Windows: Dynamically sized windows based on activity, perfect for tracking bursts of user interaction.
- Core Concepts: Covers watermarks, clocks, and handling late or out-of-order data for accurate processing.
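The three window types above can be sketched in a few lines of plain Python. This is an illustration under simplifying assumptions, not the Bytewax windowing API: the function names, the 10-second window length, the 5-second slide, and the 5-second session gap are all hypothetical, and watermarks and late data are not modeled.

```python
from datetime import datetime, timedelta, timezone


def tumbling_window(ts, align_to, length):
    """Index of the single fixed-size window that contains ts."""
    return (ts - align_to) // length


def sliding_windows(ts, align_to, length, offset):
    """Indices of every overlapping window that contains ts.

    Window i covers [align_to + i * offset, align_to + i * offset + length).
    """
    elapsed = ts - align_to
    last = elapsed // offset
    first = (elapsed - length) // offset + 1
    return list(range(first, last + 1))


def session_windows(timestamps, gap):
    """Group timestamps into sessions separated by more than `gap` of silence."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= gap:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


# Hypothetical event times, in seconds after the alignment point.
align_to = datetime(2024, 1, 1, tzinfo=timezone.utc)
seconds = [1, 4, 12, 31, 33]
times = [align_to + timedelta(seconds=s) for s in seconds]

ten_s = timedelta(seconds=10)
five_s = timedelta(seconds=5)

tumbling = [tumbling_window(t, align_to, ten_s) for t in times]
sliding = [sliding_windows(t, align_to, ten_s, five_s) for t in times]
sessions = [
    [int((t - align_to).total_seconds()) for t in session]
    for session in session_windows(times, five_s)
]

print(tumbling)
print(sliding)
print(sessions)
```

Each event lands in exactly one tumbling window, in `length / offset` sliding windows, and in a session whose boundaries depend entirely on the gaps in the data. Note that a sliding window can start before the alignment point, which is why the earliest events may report a negative index.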
Podcasts: Conversations That Matter
Some of our favorite moments from 2024 came from sharing our story on podcasts:
1️⃣ Redis Podcast: Zander Matheson discusses how Bytewax simplifies real-time processing for developers and scales with ease.
2️⃣ Hopsworks Interview: Laura Funderburk highlights Bytewax’s contributions to real-time RAG workflows and embedding advancements.
3️⃣ AI Chronicles: Zander breaks down the challenges of RAG and Bytewax’s role in powering real-time AI solutions.
Workshops and Meetups: Advancing Real-Time Innovation
💡 OSS4AI Meetup: Hosted by Yujian Tang, this event spotlighted Laura Funderburk’s presentation on real-time RAG pipelines with Bytewax. The Bytewax team’s dedication shone through, with Oli Makhasoeva attending during her maternity leave to support the talk.
🌍 Supercharge Slackbots with RAG in Real Time, a workshop by Softlandia: With 480+ registrants from more than 10 countries, this workshop, led by Henrik Nyman and Mikko Lehtimäki, took deep dives into stateful dataflows, fault tolerance, and real-time deployment. Engaged attendees and their questions lifted the sessions beyond expectations.
🎙️ Women Seattle Meetup: Focused on women in AI, this event featured inspiring talks, including one from Bytewax’s Oli Makhasoeva. It was a powerful mix of networking and thought leadership.
These events reflected Bytewax’s mission: advancing innovation through collaboration.
Looking Ahead
As we wrap up 2024, one thing is clear: this year wasn’t just about building a better Bytewax. It was about building a stronger community. Together, we’ve pushed the boundaries of what’s possible with real-time data processing, and we’re not stopping anytime soon.
What’s next? You’ll have to wait for the final part of our year in review to find out. But here’s a hint: it’s big.
For now, thank you for being part of our journey. Bytewax wouldn’t be Bytewax without you. 💛