Machine Learning

Build ML-Powered Applications with Streaming Data

The best ML applications leverage continuously generated data in real time. The user-friendly Bytewax Python API lets your team apply the ML ecosystem to streaming data and deliver powerful user experiences.

Use Cases from Our Community

Introducing Real-time Data to Machine Learning

Manufacturing

Predictive Maintenance

Bytewax processes manufacturing sensor data in real time and feeds it to machine learning models that predict equipment failures, enabling effective predictive maintenance and reducing unplanned downtime.

Finance

Fraud Detection

Machine learning algorithms analyze transaction streams in real time to detect fraud, while Bytewax handles the high-velocity data they depend on. This helps financial institutions maintain security and customer trust.

E-Commerce

Personalized Recommendations

ML models use real-time clickstreams to dynamically personalize product recommendations. Bytewax enables efficient data processing, allowing real-time recommendations based on current customer behaviors and preferences.

ML Connectors

Popular Connectors in Our ML Community

Amazon MSK: Sink & Source, Open source
Hopsworks FS: Sink, Premium
Confluent: Sink & Source, Open source
Google Vertex AI: Source, Premium
Feast: Sink, Premium
Amazon SageMaker: Sink, Premium

Architecture

Build Streaming Pipelines for Real-Time Machine Learning

ML Architecture

Machine learning models are only as good as the input features they receive at training and inference time. For many real-world applications, these features must be generated and served with minimal latency so that the ML system can make the best possible predictions.

Community projects

See how the ML community is using Bytewax

2022-bytewax-redpanda-air-quality-monitoring
This repository features a real-time anomaly detection system that ingests live sensor data using Redpanda and processes it with Bytewax to identify and publish anomalies instantly. The system efficiently monitors data streams, detects irregular patterns on the fly, and outputs the anomalies for immediate monitoring and alerting.
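To make the idea of detecting irregular patterns on the fly more concrete, here is a minimal sketch of one common approach: a running z-score computed with Welford's algorithm. This is not taken from the repository, whose detection method is not described here. The names `RunningStats`, `detect`, and `run`, and the `threshold` and `warmup` defaults, are hypothetical. The `(state, value) -> (state, result)` shape is chosen because it resembles how stateful streaming steps are commonly written; wiring it into a Bytewax dataflow is left out.

```python
import math
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class RunningStats:
    """Welford accumulator for the readings of a single sensor."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Fold one reading into the running mean and variance in O(1).
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def std(self) -> float:
        # Sample standard deviation; 0.0 until two readings have been seen.
        if self.count < 2:
            return 0.0
        return math.sqrt(self.m2 / (self.count - 1))


def detect(
    state: Optional[RunningStats],
    reading: float,
    threshold: float = 3.0,  # assumed z-score cutoff
    warmup: int = 5,         # assumed number of readings before scoring starts
) -> Tuple[RunningStats, bool]:
    """Score a reading against the history so far, then add it to the history."""
    if state is None:
        state = RunningStats()
    std = state.std()
    is_anomaly = (
        state.count >= warmup
        and std > 0.0
        and abs(reading - state.mean) / std > threshold
    )
    state.update(reading)
    return state, is_anomaly


def run(readings: Iterable[float]) -> List[bool]:
    """Apply detect() to a sequence of readings from one sensor."""
    state: Optional[RunningStats] = None
    flags = []
    for reading in readings:
        state, flag = detect(state, reading)
        flags.append(flag)
    return flags
```

Note that this sketch folds anomalous readings into the running statistics, which inflates the variance after a spike. A production system might exclude flagged readings or use a bounded window instead.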
Real-time-news-search-engine
A live system that fetches articles from news APIs, serializes them, and streams the messages to a Kafka topic. Bytewax then processes the messages from that topic by cleaning, parsing, chunking, and embedding them, and upserts the resulting vectors into a vector database that a UI can query.
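The chunking step in a pipeline like this can be sketched as follows. This is an illustration only: the project's actual chunking strategy is not described here, and splitting on whitespace with a fixed word overlap is an assumption. The function name `chunk_words` and its `size` and `overlap` parameters are hypothetical. Embedding and upserting are omitted because they depend on the model and vector database in use.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 20) -> List[str]:
    """Split text into chunks of up to `size` words, sharing `overlap` words
    between consecutive chunks so context is not cut at chunk boundaries."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must be in the range [0, size)")

    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        # Stop once a chunk has reached the end of the text, so the tail
        # is not emitted again as a shorter duplicate chunk.
        if start + size >= len(words):
            break
    return chunks
```

Each chunk would then be passed to an embedding model, and the resulting vector stored together with the chunk text and article metadata.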
build-and-deploy-real-time-feature-pipeline
A Python-based real-time feature pipeline that fetches live trade data from the Coinbase WebSocket API, transforms the data into OHLC (open, high, low, close) features using Bytewax, and stores these features in the Hopsworks Feature Store. It also includes a real-time dashboard built with Bokeh and Streamlit for interactive visualization of the final features.
130 | MIT License
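For readers unfamiliar with OHLC features, the sketch below shows how trades can be aggregated into fixed-size (tumbling) time windows. It is a plain-Python illustration, not the project's code: the `(timestamp, price, size)` tuple layout is a hypothetical simplification rather than the Coinbase message format, and the `Candle` class, `to_ohlc` function, and 60-second default window are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass
class Candle:
    window_start: int  # epoch seconds, aligned to the window size
    open: float
    high: float
    low: float
    close: float
    volume: float


def to_ohlc(
    trades: Iterable[Tuple[int, float, float]],
    window_s: int = 60,
) -> List[Candle]:
    """Aggregate (timestamp_s, price, size) trades into OHLC candles.

    Trades are assumed to arrive in timestamp order, so the first trade
    seen in a window sets the open and the last one sets the close.
    """
    candles: Dict[int, Candle] = {}
    for ts, price, size in trades:
        start = ts - ts % window_s  # align the timestamp to its window
        candle = candles.get(start)
        if candle is None:
            candles[start] = Candle(start, price, price, price, price, size)
        else:
            candle.high = max(candle.high, price)
            candle.low = min(candle.low, price)
            candle.close = price
            candle.volume += size
    return [candles[key] for key in sorted(candles)]
```

This version processes a finished batch for clarity. A streaming pipeline would instead emit each candle once its window closes and would need a policy for late-arriving trades.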
pfund
PFund (/piː fʌnd/) is an algo-trading framework designed for using machine learning models natively to trade across TradFi (Traditional Finance, e.g. Interactive Brokers), CeFi (Centralized Finance, e.g. Binance) and DeFi (Decentralized Finance, e.g. dYdX), or in simple terms, Stocks and Cryptos.
32 | Apache License 2.0