Changelog¶

3.4.0 (2023-05-01)¶

upgrade to bootstrap 4.6 (from alpha 4) (#66)
Caution: Some CSS classes changed between Boostram Alpha 4 and 4.6. You might need to upgrade other mara packages as well, e.g. mara-pipelines and mara-app.
add HttpRequest command #78 (#79)
add WriteFile command (#89)
add support for formats in file operations (#95)
add typing (#91)
add before/after task to ParallelTask only when not command list is not empty (#93)
fix get_user_display_name on docker (#90)
fix small issues (#91)
fix SQLAlchemy warning about declarative_base moved in 2.0 (#99)

required changes You might need to investigate your custom CSS styling, see boostram upgrade above.

3.3.0 (2022-09-22)¶

Add option to hide system stats in the UI #72
Add option to disable the collection of system statistics (#72)
Add syntax highlighting for TSQL and SQLite3 (#86)
Support pipeline execution without ‘mara’ database (#71)
Fix getting exitcode from process issue since python 3.8 (#87)
Use client-side rendering for graphviz when shell command is not available (#70)

3.2.0 (2021-03-08)¶

Fix CopyIncrementally with no data (#54)
Add ability to specify modification value type in CopyIncrementally (#53) 66e7dc1 Jan Katins jan.katins@zenjob.com 4. Mar 2021 at 22:06
Fix read stderr during command execution (#47)
Use echo_queries from mara_db.config.default_echo_queries (#58)
Include all versioned package files in wheel

3.1.1 (2020-07-31)¶

Fix for visible passwords in the logs despite mara_pipelines.config.password_masks() set. Bug was introduced in 3.0.0.

3.1.0 (2020-07-21)¶

Modify shell command to support the Google BigQuery integration
Add file_dependencies argument to Python commands

3.0.0 (2020-06-11)¶

Rename package from data_integration to mara_pipelines.

required changes

In requirements.txt, change -e git+https://github.com/mara/data-integration.git@2.8.3#egg=data-integration to -e git+https://github.com/mara/mara-pipelines.git@3.0.0#egg=mara-pipelines
If you use the mara-etl-tools package, update to version 4.0.0
In your project code, replace all imports from data_integration to mara_pipelines
Adapt navigation and ACL entries, if you have any (their names changed from “Data integration” to “Pipelines”)

Here’s an example of how that looks at the mara example project 2: https://github.com/mara/mara-example-project-2/commit/fa2fba148e65533f821a70c18bb0c05c37706a83

2.8.3 (2020-06-10)¶

Fix duplicated system stats if you run multiple ETLs in parallel (#38)
Add config default_task_max_retries (#39)
Cleaner shutdown (#41)

2.8.2 (2020-05-04)¶

Ignore not succeeded executions in cost calculation (#36)
Ensure we log errors via events in case of error/shutdown (#33)
Fix a bug where we reported the wrong error to chat channels when running in the browser and did not restart between failed runs (#33)

2.8.1 (2020-04-27)¶

Fix Problems when frontend and database are in a different timezone (#34)

2.8.0 (2020-03-25)¶

Implement pipeline notifications via Microsoft Teams #28
Make it possible to disable output coloring in command line etl runs (#31)

2.7.0 (2020-03-05)¶

Make event handlers configurable: this allows for e.g. adding your own notifier for specific events
Switch slack to use events for notifications of interactive pipeline runs
Fix an edge case bug where reverting a commit after an error in the table creation for an incremental load job would not recreate the original tables leading to a failed load
Fix an edge case bug where crashing during a triggered (code change, TRUNCATE) full load of an incremental load job after the table was already loaded would not rerun the full load leading to missing data
Optimize how we set the spawning method in multiprocessing

2.6.1 (2020-02-20)¶

Fix for Python 3.7 (“RuntimeError: context has already been set”)

2.6.0 (2020-02-12)¶

Python 3.8 compatibility (explicitly set process spawning method to ‘fork’)
Fix open runs after browser reload
Add workaround for system statistics on wsl1
Speedup incremental insert into partitioned tables
Show warning when graphviz is not installed

2.5.1 (2019-08-01)¶

Include file_dependencies as variable for Copy Commands: This could handle cases in ETL pipeline, where the copy command shall be skipped if the sql_files stay the same.

2.5.0 (2019-07-07)¶

Bug fix: make last modification timestamp of parallel file reading time zone aware (fixes TypeError: can't compare offset-naive and offset-aware datetimes error)

2.4.0 - 2.4.2 (2019-07-04)¶

Add travis integration and PyPi upload

2.3.0 (2019-07-04)¶

Add parameter csv_format and delimiter_char to Copy and CopyIncrementally commands.

2.2.0 (2019-07-02)¶

Changed all TIMSTAMP to TIMSTAMPTZ in the mara tables. You have to manually run the below migration commands as make migrate-mara-db won’t pick up this change.

required changes You need to manually convert the mara tables to TIMESTAMPTZ:

-- Change the timezone to whatever your ETL process is running in
ALTER TABLE data_integration_run ALTER start_time TYPE timestamptz
  USING start_time AT TIME ZONE 'Europe/Berlin';
ALTER TABLE data_integration_run ALTER end_time TYPE timestamptz
  USING end_time AT TIME ZONE 'Europe/Berlin';
ALTER TABLE data_integration_processed_file ALTER last_modified_timestamp TYPE timestamptz
  USING last_modified_timestamp AT TIME ZONE 'Europe/Berlin';
ALTER TABLE data_integration_node_run ALTER start_time TYPE timestamptz
  USING start_time AT TIME ZONE 'Europe/Berlin';
ALTER TABLE data_integration_node_run ALTER end_time TYPE timestamptz
  USING end_time AT TIME ZONE 'Europe/Berlin';
ALTER TABLE data_integration_node_output ALTER timestamp TYPE timestamptz
  USING timestamp AT TIME ZONE 'Europe/Berlin';
ALTER TABLE data_integration_file_dependency ALTER timestamp TYPE timestamptz
  USING timestamp AT TIME ZONE 'Europe/Berlin';
ALTER TABLE data_integration_system_statistics ALTER timestamp TYPE timestamptz
  USING timestamp AT TIME ZONE 'Europe/Berlin';

2.1.0 (2019-05-15)¶

Track and visualize also unfinished pipeline runs
Speed up computation of node durations and node cost
Improve error handling in launching of parallel tasks
Improve run times visualization (better axis labels, independent tooltips)
Smaller ui improvements

2.0.0 - 2.0.1 (2019-04-12)¶

Remove dependency_links from setup.py to regain compatibility with recent pip versions
Change MARA_XXX variables to functions to delay importing of imports
move some imports into the functions that use them in order to improve loading speed
Add ability to mask passwords in Commands, so they cannot show up in the UI anymore or are not written to the database in saved Events (config data_integration.config.password_masks()). See the example in the original function how to not let passwords show up in the settings UI. (gh #14)

required changes

Update mara-app to >=2.0.0

1.4.0 - 1.4.7 (2018-09-15)¶

Use postgresql 10 native partitioning for creating day_id partitions in ParallelReadFile
Catch and display exceptions when creating html command documentation
Add python ParallelRunFunction
Add option to use explicit upsert on incremental load (explicit UPDATE + INSERT)
Emit a proper NodeFinished event when the launching of a parallel task failed
Add option truncate_partition to parallel tasks
Fix bug in run_interactively cli command
Make it possible to run the ExecuteSQL command outside of a pipeline via .run()
Add args parameter to RunFunction command
Show redundant node upstreams as dashed line in pipeline graphs
Fix problems with too long bash commands by using multiple commands for partition generation in ParallelReadXXX tasks

required changes

When using ParallelReadFile with parameter partition_target_table_by_day_id=True, then make sure the target table is natively partitioned by adding PARTITION BY LIST (day_id).

1.3.0 (2018-07-17)¶

Add possibility to continue running child nodes on error (new Pipeline parameters continue_on_error and force_run_all_children)
Make dependency on requests explicit

1.2.0 (2018-06-01)¶

Implement ReadMode ONLY_CHANGED that reads all new or modified files
Show node links in run output only relative to current node (to save space)

1.1.0 (2018-05-23)¶

Add slack notifications to “run_interactively” cli command
Add parameter max_retries to class Task
Fix typos in Readme
Optimize imports

1.0.0 - 1.0.4 (2018-05-02)¶

Move to Github
Improve documentation
Add ReadMode ‘ONLY_LATEST’
Add new command ReadScriptOutput
Add slack bot configuration
Fix url in slack event handler

Changelog¶

3.4.0 (2023-05-01)¶

3.3.0 (2022-09-22)¶

3.2.0 (2021-03-08)¶

3.1.1 (2020-07-31)¶

3.1.0 (2020-07-21)¶

3.0.0 (2020-06-11)¶

2.8.3 (2020-06-10)¶

2.8.2 (2020-05-04)¶

2.8.1 (2020-04-27)¶

2.8.0 (2020-03-25)¶

2.7.0 (2020-03-05)¶

2.6.1 (2020-02-20)¶

2.6.0 (2020-02-12)¶

2.5.1 (2019-08-01)¶

2.5.0 (2019-07-07)¶

2.4.0 - 2.4.2 (2019-07-04)¶

2.3.0 (2019-07-04)¶

2.2.0 (2019-07-02)¶

2.1.0 (2019-05-15)¶

2.0.0 - 2.0.1 (2019-04-12)¶

1.4.0 - 1.4.7 (2018-09-15)¶

1.3.0 (2018-07-17)¶

1.2.0 (2018-06-01)¶

1.1.0 (2018-05-23)¶

1.0.0 - 1.0.4 (2018-05-02)¶

Mara Pipelines

Navigation

Related Topics