Do we need a “scheduled ETL data service” on the cloud?

SamSam
2 min read · Oct 7, 2019


The conclusion, for lazy people: yes.

But why?

I have a local service machine (a workstation) for my scheduled jobs and master data management. To me, a cloud scheduling service was just an option. My database and my data source are both on the cloud.

I thought there would be no difference between running the schedule on the cloud and running it locally. Either way it is just a trigger, and the services it triggers are always on the cloud.

But I was wrong.

Actually, once the schedule is set, the job still has to connect to both sides of the data, from source to target. When the scheduler runs locally, every one of those connections crosses the network between the cloud and the local machine, and that costs time.
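To make that cost concrete, here is a rough back-of-the-envelope sketch in Python. It is only a simplified model, not what I actually ran: the function name, the parameters, and every number in it are made-up assumptions for illustration, not measurements.

```python
def estimated_job_seconds(round_trips: int, latency_ms: float,
                          payload_mb: float, bandwidth_mbps: float) -> float:
    """Rough cost model: per-request latency plus time to move the payload."""
    latency_cost = round_trips * latency_ms / 1000.0   # seconds spent waiting on round trips
    transfer_cost = payload_mb * 8 / bandwidth_mbps    # megabytes -> megabits, then divide by Mbit/s
    return latency_cost + transfer_cost


# Hypothetical numbers only: a local scheduler talks to the cloud over the
# internet, a cloud scheduler talks to the data over the provider's network.
local_seconds = estimated_job_seconds(round_trips=1000, latency_ms=80,
                                      payload_mb=500, bandwidth_mbps=100)
cloud_seconds = estimated_job_seconds(round_trips=1000, latency_ms=2,
                                      payload_mb=500, bandwidth_mbps=1000)

print(f"local: {local_seconds:.1f}s, cloud: {cloud_seconds:.1f}s")
```

The point is only that both terms shrink when the scheduler sits next to the data. The real ratio depends on your own network and workload.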

Don’t underestimate it. Really.

The test is quite simple, so I won't show the process. I will just show the result, which is why we need to be aware of “schedule on cloud”.
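If you want to try a comparison like this yourself, a minimal timing sketch could look like the following. This is not my actual test: `time_job` and `placeholder_etl_job` are hypothetical names, and the placeholder just sleeps where a real extract and load would go.

```python
import time
from typing import Callable


def time_job(job: Callable[[], None], runs: int = 3) -> float:
    """Return the average wall-clock seconds over several runs of job."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        job()
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)


def placeholder_etl_job() -> None:
    # Hypothetical stand-in: replace with the real read-from-source
    # and write-to-target steps of your own job.
    time.sleep(0.01)


average_seconds = time_job(placeholder_etl_job, runs=3)
print(f"average run time: {average_seconds:.3f}s")
```

Run the same job once from the local machine and once from a machine on the cloud, then compare the two averages.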

Local scheduler:

Cloud scheduler:

The cloud scheduler is 16 times faster. End.

PS
For an example of this concept, see the sample in the AWS docs. Have a nice day.

Originally published at http://datamansamxiao.wordpress.com on October 7, 2019.
