Generalities of data pipelines


A data pipeline can be seen as a stack of layers, where the output of each layer is the input of the next one. We will not go deep into each layer; we will only mention the most common layers that a data pipeline can have.

  • Data source: this is the foundation of your pipeline; all your raw data lives in this layer. The data can be structured or unstructured, and the layer can be composed of multiple elements: relational databases, NoSQL databases, plain files, etc. For example:

    • MySQL

    • MariaDB

    • Google Cloud Storage

    • Firebase

    • Amazon RDS

    • XML files

    • Apps

    • Wearables

  • Ingestion and integration layer: this layer reads the data from the data sources and hands it to data processing. Here you load the data into a target storage, giving it a format that the rest of your pipeline is capable of understanding.

    • REST/MQTT endpoints

    • Message queue

    • Firebase REST API

    • SFTP

  • Storage layer: this layer is responsible for persisting the data, in SQL or NoSQL databases. We will focus on it in the next subchapter, since it is an important concept for your applications.

    • SQL databases

    • NoSQL databases

  • Processing/computation layer: this layer aggregates data, mixes data sources, and pre-calculates the data that the next layer uses for visualization. It can work in streaming or batch mode. (This is where our analytics engine resides, but it needs a stable storage layer and a good presentation layer.)

    • Self-hosted scripts (e.g., Python scripts, SQL scripts, etc.)

    • Apache Storm

    • Apache Spark

    • Apache Flink

    • Machine learning models

    • Crashlytics

  • Presentation layer: this layer presents the insights through dashboards, emails, SMS messages, push notifications, and more. Keep in mind that machine learning models are generally exposed as microservices.

    • Amazon QuickSight

    • Metabase

    • Apache Superset

    • Tableau

    • Looker

    • Real-time dashboards

    • Zoomdata
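
To make the idea of "each layer feeds the next" more concrete, here is a minimal sketch in Python that uses only the standard library. Everything in it is an assumption made for illustration: the `RAW_EVENTS` sample data, the `load_times` table, and the function names `ingest`, `store`, `process`, `present`, and `run_pipeline` are hypothetical, and an in-memory SQLite database stands in for a real storage layer. A real pipeline would replace each function with one of the tools listed above.

```python
import sqlite3

# Data source layer: hypothetical raw events, standing in for records
# pulled from MySQL, Firebase, an app, a wearable, etc.
RAW_EVENTS = [
    {"user_id": "ana", "screen": "home", "ms": "120"},
    {"user_id": "ben", "screen": "home", "ms": "180"},
    {"user_id": "ana", "screen": "cart", "ms": "300"},
]


def ingest(raw_events):
    """Ingestion layer: give every record the format the pipeline expects."""
    return [(e["user_id"], e["screen"], int(e["ms"])) for e in raw_events]


def store(rows):
    """Storage layer: persist the ingested rows (here, in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE load_times (user_id TEXT, screen TEXT, ms INTEGER)"
    )
    conn.executemany("INSERT INTO load_times VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def process(conn):
    """Processing layer: batch aggregation, average load time per screen."""
    cursor = conn.execute(
        "SELECT screen, AVG(ms) FROM load_times GROUP BY screen ORDER BY screen"
    )
    return dict(cursor.fetchall())


def present(averages):
    """Presentation layer: turn the aggregates into a plain-text report."""
    return "\n".join(
        f"{screen}: {avg:.0f} ms" for screen, avg in averages.items()
    )


def run_pipeline(raw_events):
    """Chain the layers: the output of each one is the input of the next."""
    return present(process(store(ingest(raw_events))))


if __name__ == "__main__":
    print(run_pipeline(RAW_EVENTS))
```

Keeping each layer behind its own function mirrors the layered view: you could swap the SQLite storage for another database, or the text report for a dashboard, without touching the other layers.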

In the next two subchapters, we go deeper into the storage layer and briefly explain the most popular types of data pipelines.


Last updated 1 year ago