ismaildawoodjee

Pinned

  1. vulnerability-research Public

    Vulnerability research

  2. GreatEx Public

    A project for exploring how Great Expectations can be used to ensure data quality and validate batches within a data pipeline defined in Airflow.

    Python 25 7

  3. aws-data-pipeline Public

    A batch processing data pipeline, using AWS resources (S3, EMR, Redshift, EC2, IAM), provisioned via Terraform, and orchestrated from locally hosted Airflow containers. The end product is a Superse…

    Python 24 6

  4. nifi-data-pipeline Public

    📦 Containerizing everything from the book "Data Engineering with Python"

    Python 1 1

  5. product-catalogue Public

    Web scraping task to obtain industrial equipment data and produce a product catalogue out of it. From this page:

    Python 1

  6. spark-big-data Public

    Spark with Scala. A big data project that analyzes 35 GB of Parquet data (~400 GB as decompressed CSV) and extracts business insights from it.

    Scala 2