Asynchronous Data Streaming with HTTP+JSON – .NET and JavaScript Example

Table of Contents:

  1. What is the definition of data streaming?
  2. Abstract HTTP endpoint design.
  3. Pros and cons of the streaming approach.
  4. Alternatives to the streaming approach.
  5. Where can the streaming approach be used?
  6. Demo application.
  7. FAQ
  8. TL;DR

 

Definition of data streaming

Asynchronous data streaming is a technology paradigm where data is continuously produced, ingested, and processed in real time. It allows for immediate analysis and action upon data as it’s generated, rather than requiring a complete dataset before processing can begin.

 

How do we understand it?

In the context of asynchronous data streaming with HTTP+JSON, we understand it as follows:

  • Continuous Generation: Data streams are composed of a sequence of data that is generated or gathered over time from various sources, which could include sensors, user interactions, transactions, databases, or services.
  • Incremental Processing: Unlike batch processing, which requires all data to be available upfront, streaming data is processed incrementally. This means that as each piece of data arrives, it is immediately processed, which is essential for time-sensitive applications.
  • Applicability Beyond Media: While often associated with video or audio streaming, the concept of data streaming extends to any sizable, continuously updated data source. This includes large JSON arrays transmitted over HTTP, where each array element can be thought of as a ‘chunk’ of the data stream.

By leveraging asynchronous communication and JSON formatting, data streaming over HTTP can facilitate real-time data flow between servers and clients, enabling dynamic web applications and services that respond promptly to new information.

 

Examples of data streams

The concept of data streaming is not limited to any single technology or platform. Here are some examples:

  • .NET Streams: In .NET, a stream represents a sequence of bytes. Common types include MemoryStream for storing data in memory and FileStream for reading from or writing to files.
  • RxJS Streams: RxJS, or Reactive Extensions for JavaScript, provides an API for asynchronous programming with observable streams. It allows for composing and querying data streams with a rich set of operators.
  • Operating System Streams: These are low-level streams provided by the operating system, such as network streams, which handle data transmission over a network.

Each of these streams serves a different purpose, but they share the common trait of handling data in a sequential, flowing manner, which is the essence of data streaming.

 

Where can the streaming approach be used?

The streaming approach to data handling is versatile and can be applied across a multitude of industries and scenarios. Beyond the travel industry and social media feeds, here are more examples where streaming is beneficial:

  • Financial Services: In stock trading platforms, streaming is used to deliver real-time market data, allowing traders to make informed decisions based on the latest information.
  • Gaming: Online multiplayer games use streaming to synchronize game state information between the server and multiple clients, ensuring a seamless gaming experience.
  • Healthcare: Streaming can be employed in telemedicine platforms for live patient monitoring, where continuous data flow is critical for patient care.
  • IoT Devices: Internet of Things (IoT) devices, such as sensors and smart home products, rely on streaming to send real-time data to servers for processing and analysis.
  • Media and Entertainment: Streaming is the backbone of video and music streaming services, providing users with immediate access to content.
  • E-commerce: Real-time inventory updates and customer interactions in e-commerce platforms can be facilitated through streaming.
  • Logistics and Transportation: Streaming is used for tracking shipments and vehicles in real-time, optimizing routes and delivery schedules.

The streaming approach demonstrates its versatility and critical role in enabling real-time, efficient, and uninterrupted data flow across diverse industries. Beyond the previously mentioned use cases, there is potential to delve into more intricate technical scenarios:

Consider the use of HTTP+JSON data streaming in situations where you are the recipient of data from an HTTP or gRPC source, and the volume of records per request is uncertain. Moreover, the number of records may vary with each subsequent request, rendering pagination impractical. In such unpredictable circumstances, data streaming is a good fit. For instance, a flight booking portal or an offers aggregator like Google Flights or Booking.com could implement data streaming to handle fluctuating data volumes efficiently.

 

Abstract HTTP endpoint design

When structuring a streaming HTTP endpoint, the goal is to ensure reliability and robust error handling. Here’s how we can achieve this:

 

How do we structure our streaming endpoint?

  • Each element of the JSON array is treated as a container that can hold both data and errors. This allows more granular error handling, where issues are addressed individually within each data segment.
  • If an error occurs during data retrieval or processing on the server, it is returned within the aforementioned container. This method avoids the need to buffer the entire response and simplifies the error reporting process.
  • Despite the presence of errors, the response maintains an array format. Data items are encapsulated within containers that store both the data and any associated errors.
  • Traditionally, HTTP responses would return status codes such as 200, 4xx, or 5xx to indicate the success or failure of a request. However, with streaming endpoints, the response is typically a 200 or 207 status code, and the client is responsible for handling errors contained within the data stream.
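
The points above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the demo's actual implementation: the container shape ({ data, error }) and the helper names ok, fail, toContainers and toJsonArrayChunks are assumptions made for this example. It shows a failed item becoming an error container while the response stays a single JSON array that is written piece by piece.

```javascript
// Hypothetical container shape: either `data` or `error` is filled in.
const ok = (data) => ({ data, error: null });
const fail = (message) => ({ data: null, error: { message } });

// Wraps item producers so that a failure while producing one item
// becomes an error container instead of aborting the whole response.
async function* toContainers(producers) {
  for (const produce of producers) {
    try {
      yield ok(await produce());
    } catch (err) {
      yield fail(err.message);
    }
  }
}

// Serializes the containers as one JSON array, emitted chunk by chunk,
// so the server never has to buffer the complete response.
async function* toJsonArrayChunks(containers) {
  yield "[";
  let first = true;
  for await (const container of containers) {
    yield (first ? "" : ",") + JSON.stringify(container);
    first = false;
  }
  yield "]";
}
```

Each chunk could be written to the HTTP response as soon as it is produced. The status code has already been sent by then, which is why errors travel inside the containers.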

 

Is an HTTP Streaming Endpoint a REST Endpoint?

An HTTP streaming endpoint does not conform to the traditional REST architectural style, because it can return a 200 OK or 207 Multi-Status response even when there are errors in the payload. To use this endpoint efficiently, clients must be aware of its streaming nature and be designed to handle the continuous flow of data and errors.

 

How do we structure our endpoint clients?

When designing clients for HTTP streaming endpoints, it is essential to ensure they are equipped to handle the intricacies of the data streaming endpoint. A client, whether it operates within a web browser, serves as a console application, or functions as another backend service, must possess the capability to execute HTTP requests. Additionally, it should be able to deserialize the HTTP network data stream, which is typically encoded in UTF-8 JSON format. As the data stream is received, the client must also process it token by token. This granular approach allows the client to convert these tokens into objects that can be displayed, stored, or processed to extract meaningful information. Moreover, the client must be prepared to manage errors within the result containers. Effective error handling is crucial, as it ensures the client application remains operational and responsive, even when encountering problematic data segments.
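
As a rough sketch of such a client, the JavaScript below scans incoming text chunks and yields each top-level array element as soon as it is complete, then routes data and errors to separate callbacks. It is a simplified stand-in for a real streaming parser: it assumes every element is an object or array (as the result containers are), and the names parseArrayItems and consume, as well as the { data, error } shape, are hypothetical.

```javascript
// Splits a stream of text chunks holding one JSON array into its
// top-level elements, yielding each element once it is complete.
// Assumption: elements are objects or arrays, so tracking the nesting
// depth is enough to find where each element starts and ends.
async function* parseArrayItems(chunks) {
  let depth = 0;        // 1 means "directly inside the outer array"
  let inString = false; // true while scanning inside a JSON string
  let escaped = false;  // true right after a backslash in a string
  let current = "";     // text of the element being collected

  for await (const chunk of chunks) {
    for (const ch of chunk) {
      if (depth >= 2) current += ch; // inside an element: keep the char
      if (inString) {
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') {
        inString = true;
      } else if (ch === "{" || ch === "[") {
        depth += 1;
        if (depth === 2) current = ch; // a new element starts here
      } else if (ch === "}" || ch === "]") {
        depth -= 1;
        if (depth === 1) {
          yield JSON.parse(current); // element complete
          current = "";
        }
      }
    }
  }
}

// Routes each container to the matching callback and keeps reading
// after an error, so one bad item does not end the stream.
async function consume(chunks, onItem, onError) {
  for await (const container of parseArrayItems(chunks)) {
    if (container.error) onError(container.error);
    else onItem(container.data);
  }
}
```

Chunk boundaries can fall anywhere, including in the middle of a property name, which is why the scanner keeps its state across chunks instead of parsing each chunk on its own.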

 

Pros and cons of the streaming approach

When considering asynchronous data streaming, it’s essential to weigh its advantages and drawbacks to understand its applicability thoroughly.

Pros:

  • Mixed Response Handling: Data streaming facilitates a mixed response model, allowing both items and errors to be transmitted seamlessly.
  • Large Payloads: It is well-suited for transferring large payloads, which might otherwise be cumbersome with traditional methods.
  • Asynchronous Server Calls: The server can handle calls asynchronously, improving throughput and efficiency.
  • Client-Controlled Termination: Clients gain the flexibility to terminate the stream as needed, providing better control over the data flow.
  • Quick First Item Delivery: The first item’s delivery is expedited, offering a faster response time compared to batched list responses.
  • Memory Efficiency: On the server side, streaming is more memory-efficient, and depending on the client’s implementation, it can also reduce memory usage on the client side.

Cons:

  • Non-standard RESTful Behavior: Streaming endpoints may return errors within a 200 OK response, deviating from standard REST practices.
  • Client-Side Error Handling: Clients are responsible for managing failures, as error codes (4xx or 5xx) are not typically returned by the server.
  • Awareness Requirement: Clients must be aware of the streaming nature of the server to handle the data appropriately.
  • Complex Payload Logging: Logging the payload is more challenging, and regular buffering techniques may interfere with streaming capabilities.
  • Partial Data View: Until the entire data set is transferred, clients only have a partial view, which may not reflect the complete picture.

In summary, while the streaming approach offers significant benefits in terms of efficiency and control, it also imposes additional considerations on the client side, particularly regarding error handling and data completeness.

 

 

Alternatives to the streaming approach

While asynchronous data streaming offers numerous benefits, there are alternative methods worth considering, each with its own set of advantages and limitations.

Returning Big Arrays:

Pros:

  • Complete Data Delivery: Ensures all data is delivered at once, which may be necessary for certain applications.
  • REST API Compliance: Adheres to REST API standards, making it a familiar approach for developers.
  • Simplicity: Offers a more straightforward implementation compared to paginated outputs.
  • Cost-Effectiveness: Potential to keep costs down due to its simplicity.

Cons:

  • High Memory Usage: This can lead to significant memory consumption on the server.
  • Scalability Issues: Response times can become prohibitively slow with large data sets, affecting scalability.

 

Paging:

Pros:

  • Standardized: Follows REST API conventions.
  • Network Efficiency: Limits results per API query, controlling network traffic.
  • Scalability and Customization: Allows users to specify data volume and pagination.

Cons:

  • Partial Data: Only provides snapshots of data, not the entire set.
  • Dynamic Data Challenges: This may not be ideal for scenarios with dynamic or variable data retrieval.
  • Dependency: Requires data suppliers to support paging mechanisms.

 

gRPC:

Pros:

  • Efficient Payloads: Binary format reduces payload size.
  • Language Agnostic: Supported by most modern programming languages.
  • Versatility: Supports both unary and streaming calls, including client-side streaming.

Cons:

  • Protocol Buffers: Necessitates the use of .proto files.
  • Non-Human Readable: Binary payloads are not easily interpretable by humans.
  • Browser Support: Lacks native support in web browsers.

 

Queuing Systems (e.g., RabbitMQ, Azure Service Bus, Amazon SQS):

Pros:

  • Scalability: Enhances the ability of an API service to manage numerous requests efficiently.
  • Asynchronous Processing: Allows the API to respond immediately while processing occurs in the background.
  • Fault Tolerance: Facilitates retries in case of failures, improving reliability.
  • Load Balancing: Distributes workload across multiple service instances.

Cons:

  • Complexity: Adds layers of complexity to the API service infrastructure.
  • Potential Latency: This may introduce additional latency during request processing.
  • Cost Considerations: This could incur higher costs depending on the scale and queue system used.
  • Communication Limitations: Primarily suitable for backend-to-backend interactions.

 

Each alternative presents a different approach to handling data delivery, with trade-offs between completeness, performance, and complexity. The choice of method will largely depend on the specific requirements and constraints of the application in question.

 

Demo application

In this segment, we explore a demo application created for the Zartis tech community, which embodies a real-life scenario by featuring a web API endpoint that simulates hotel offers. This proof of concept leverages the abstract server and client concepts previously discussed, providing a practical demonstration of asynchronous data streaming.

 

Technical Aspects

Before we dive into the application’s code, there are a few technical things that you need to know:

  • The source code for the demo application is hosted on GitHub here: https://github.com/abinkowski94/json-streaming-example
  • The server is built using .NET 8, showcasing the latest advancements in the framework.
  • The application includes two distinct clients:
    • A .NET 8 console application, demonstrating a streamlined, backend-focused interaction.
    • A vanilla JavaScript web application, emphasizing front-end simplicity and efficiency, with oboe.js for stream parsing.

 

The repository also contains some shared elements/concepts that are used both on the server and on the client applications:

  • Common Contracts: A shared csproj connects the .NET client and server, ensuring uniformity in web API models and contracts.
  • Asynchronous Streaming: Both the server and client utilize IAsyncEnumerable<T> to fully engage asynchronous streaming capabilities.
  • Serialization: The System.Text.Json serializer is employed for both serialization and deserialization of the data stream as IAsyncEnumerable<T>.

 

Server Architecture and Features 

The server is designed to interface with three distinct data providers: an SQL database, a CSV file, and an in-memory data generator. Each provider is compatible with the IAsyncEnumerable<T> interface, facilitating data streaming capabilities. The central system integrates these individual streams into a unified data flow. Furthermore, the server endpoint offers customization options such as setting an error rate ranging from 0 to 1 and capping the output to a specified number of elements. Additionally, users can choose how to combine the three data streams, with ‘concat’ as the default method and ‘zip’ as an alternative for interleaving data from different sources.
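
The demo server does this with IAsyncEnumerable<T> in .NET. The JavaScript sketch below only illustrates the idea with async generators, and all function names (concat, zip, take, withErrors) are chosen for this example. In particular, ‘zip’ is read here as a round-robin interleave, which is our interpretation of the description above rather than a confirmed detail of the demo.

```javascript
// 'concat': every item of the first source, then the second, and so on.
async function* concat(...sources) {
  for (const source of sources) yield* source;
}

// 'zip' (as interpreted here): one item from each source in turn,
// dropping sources from the rotation as they run out.
async function* zip(...sources) {
  let iterators = sources.map((s) => s[Symbol.asyncIterator]());
  try {
    while (iterators.length > 0) {
      const alive = [];
      for (const it of iterators) {
        const { done, value } = await it.next();
        if (!done) {
          yield value;
          alive.push(it);
        }
      }
      iterators = alive;
    }
  } finally {
    // If the consumer stops early, close the remaining sources too.
    for (const it of iterators) await it.return?.();
  }
}

// Caps the output at maxCount items and then stops pulling from the source.
async function* take(source, maxCount) {
  if (maxCount <= 0) return;
  let count = 0;
  for await (const item of source) {
    yield item;
    if (++count >= maxCount) return;
  }
}

// Replaces an item with a simulated error with probability errorRate (0 to 1).
async function* withErrors(source, errorRate, random = Math.random) {
  for await (const item of source) {
    yield random() < errorRate
      ? { data: null, error: { message: "simulated failure" } }
      : { data: item, error: null };
  }
}
```

Because every step is itself an async generator, the steps compose, for example take(withErrors(zip(sql, csv, generated), 0.1), 50), and nothing is pulled from a provider until the client asks for the next item.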

Figure: JSON stream for asynchronous streaming demo application

 

Console Application 

The .NET console application is straightforward, relying solely on the standard .NET SDK and a reference to the contracts csproj, without any external NuGet packages. It creates an HTTP client to execute a basic HTTP GET request. A notable feature is the HttpCompletionOption.ResponseHeadersRead flag, which makes the client return the HttpResponseMessage as soon as the headers have been read, instead of buffering the full content first. This grants access to the unbuffered network stream, enabling item-by-item processing as data arrives. The System.Text.Json library complements this with its DeserializeAsyncEnumerable<T> method, which converts the incoming data into an IAsyncEnumerable<T>.
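
The console client itself is written in C#, so the following is only a JavaScript analogue of the same unbuffered reading pattern, not the demo's code. In JavaScript, fetch resolves once the response headers have arrived, and response.body.getReader() then exposes the body as it streams in. The helper name readTextChunks and the URL in the usage comment are placeholders invented for this sketch.

```javascript
// Turns a byte reader (for example response.body.getReader()) into an
// async sequence of decoded text chunks, without buffering the full body.
async function* readTextChunks(reader) {
  const decoder = new TextDecoder("utf-8");
  let finished = false;
  try {
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // stream: true keeps multi-byte characters that are split
      // across two network chunks intact.
      yield decoder.decode(value, { stream: true });
    }
    finished = true;
    const tail = decoder.decode(); // flush any buffered bytes
    if (tail) yield tail;
  } finally {
    // If the consumer stopped early, tell the server we are done.
    if (!finished) await reader.cancel();
  }
}

// Hypothetical usage (placeholder URL, not the demo's real route):
//   const response = await fetch("http://localhost:5000/offers");
//   for await (const text of readTextChunks(response.body.getReader())) {
//     console.log(text);
//   }
```

Breaking out of the for await loop cancels the reader, which is one way to get the client-controlled termination listed among the pros earlier.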

Web Browser Application

The web browser application employs vanilla JavaScript and leverages the oboe.js library for executing requests. Oboe.js specializes in object-by-object stream deserialization and invokes a callback upon each object’s readiness. The interface presents three straightforward settings: ‘Max results count’ to limit object returns, ‘Error chance’ to set the likelihood of simulated errors on a scale from 0 to 1, and ‘Mix data source’ to toggle between concatenating or zipping the data streams from the mentioned providers.

Hands-On Exploration

For a hands-on experience, I suggest downloading the repository to explore the application firsthand. The prerequisites are minimal, requiring only the .NET 8 SDK, or newer, and a code editor like Visual Studio, Visual Studio Code, or Rider. Placing breakpoints throughout the code can offer insightful observations into the streaming process and its management.

 

FAQ

  • How can I configure the application to work on a different port? 
    • Unfortunately, the port is hardcoded for now. Maybe someday I’ll add better localhost orchestration for this project. My suggestion is to stick to the hardcoded port or replace it in three places (the server properties, the .NET client code, and the JS file).
  • CORS
    • The backend API is configured to accept any request in development mode.
  • HTTP/2
    • The application requires HTTP/2 to work. I have not tried it with HTTP/3.
  • How to set up the database?
    • The simple answer is you do not have to. The demo application already uses a SQLite database that is stored in this GitHub repository.

 

TL;DR

The article discusses asynchronous data streaming with HTTP+JSON, a technology paradigm where data is continuously produced, ingested, and processed in real time. It covers the concept of data streaming, examples of data streams, the design of a streaming HTTP endpoint, and the pros and cons of the streaming approach. The article also explores alternatives to the streaming approach, such as returning big arrays, paging, gRPC, and queuing systems, and highlights the broad applicability of streaming across industry sectors. It concludes with a demo application created for the Zartis tech community, which features a web API endpoint that simulates hotel offers and serves as a practical demonstration of asynchronous data streaming. The source code for the demo application is hosted on GitHub. The server is built using .NET 8, and the application includes two distinct clients: a .NET 8 console application and a vanilla JavaScript web application. The server interfaces with three distinct data providers: an SQL database, a CSV file, and an in-memory data generator. The server endpoint offers customization options such as setting an error rate and capping the output to a specified number of elements. The console application uses an HTTP client to execute a basic HTTP GET request, while the web browser application employs vanilla JavaScript and leverages the oboe.js library for executing requests.

 

Author:

Augustyn is a seasoned Software Engineer with an MSc in Computer Science and years of experience in the design and development of applications. He possesses a profound knowledge and skill set in constructing web applications, API backend systems, and database development. Throughout his career, Augustyn has demonstrated proficiency in working with frameworks such as .NET (Framework, Core, .NET 6+), Angular, and Vue.js. Moreover, he has amassed considerable expertise in Azure cloud technology. His professional aspiration is to continuously enhance his skills and ascend to the role of Software Architect.

 
