Automated diff through a consolidated cleaned version of the JSON files?
I have a bit of free time over the next 15 days and will try to work a bit more on the data provided by the AN.
As mentioned in another issue, I began working on automated processing of the data with Ansible, reusing in the playbooks the scripts that have been developed so far.
Looking at the output, I can see that the cleaning process creates new versions of the split files. However, I'm not sure whether there is a script that merges all the cleaned data back into a single file (a cleaned amendements_XV.json, for example).
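In case no such script exists, here is a minimal sketch of what the merge step could look like. It is only an illustration under assumptions: that a `*_nettoye` directory holds one JSON document per file (possibly in subdirectories), and that a plain JSON array is an acceptable consolidated format. The function name `merge_cleaned` and the output layout are hypothetical, not something taken from the existing scripts.

```python
import json
from pathlib import Path


def merge_cleaned(cleaned_dir, output_file):
    """Merge every *.json file found under cleaned_dir into one JSON array.

    Assumption: each file contains exactly one JSON document.
    The output file should live outside cleaned_dir, otherwise a second
    run would pick up the previous merged file as an input.
    """
    # Sort the paths so that two runs over the same files give the same order.
    paths = sorted(Path(cleaned_dir).rglob("*.json"))
    documents = []
    for path in paths:
        with path.open(encoding="utf-8") as handle:
            documents.append(json.load(handle))
    # sort_keys keeps the serialisation stable, which makes later diffs smaller.
    with open(output_file, "w", encoding="utf-8") as handle:
        json.dump(documents, handle, ensure_ascii=False, sort_keys=True, indent=1)
    return len(documents)
```

Sorting the paths and the keys is a deliberate choice here: a stable output means that two dumps with the same content produce the same file, so any difference between them is a real change.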
The main point would be to then use jq to extract JSON diffs between a previous dump of the data and a new one.
This would allow lightweight updates to a database fed by the successive diffs.
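To make the idea concrete, here is a rough sketch of such a diff, written in Python rather than jq. It assumes each dump is a single JSON array of objects and that every object carries a unique identifier; the field name `uid`, the function names, and the sample records are assumptions made up for the illustration.

```python
import json


def index_by_key(records, key):
    """Build a lookup table from identifier to record."""
    return {record[key]: record for record in records}


def diff_dumps(old_records, new_records, key="uid"):
    """Compare two dumps and report what a database update would need.

    Assumption: `key` is present and unique in every record.
    """
    old = index_by_key(old_records, key)
    new = index_by_key(new_records, key)
    # Records only in the new dump must be inserted.
    added = [new[k] for k in sorted(new.keys() - old.keys())]
    # Identifiers only in the old dump must be deleted.
    removed = sorted(old.keys() - new.keys())
    # Records present in both dumps but with different content must be updated.
    changed = [new[k] for k in sorted(old.keys() & new.keys()) if old[k] != new[k]]
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the result.
    previous = [{"uid": "A1", "etat": "depose"}, {"uid": "A2", "etat": "depose"}]
    current = [{"uid": "A2", "etat": "adopte"}, {"uid": "A3", "etat": "depose"}]
    print(json.dumps(diff_dumps(previous, current), indent=2))
```

The result maps directly onto insert, delete and update operations, which is what a database fed by successive diffs would need. The whole of both dumps is loaded in memory, so this may need rethinking for the largest files.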
Any ideas or suggestions on how to approach this?
EDIT:
I can see that for some files, like scrutins, we get a whole consolidated cleaned version. Is there such a thing for amendements?
So far, this is the list I get after running the current process (see the additional note below):
-rw-r--r-- 1 gitlab-runner gitlab-runner 17483083 Oct 22 14:00 Agenda_XIV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 956779 Oct 22 14:00 Agenda_XIV.json.zip
drwxr-xr-x 5 gitlab-runner gitlab-runner 4096 Oct 22 15:09 Agenda_XIV_nettoye
drwxr-xr-x 5 gitlab-runner gitlab-runner 4096 Oct 22 14:31 Agenda_XV
drwxr-xr-x 3 gitlab-runner gitlab-runner 4096 Oct 22 13:53 Agenda_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 25192623 Oct 22 13:53 Agenda_XV.json.zip
drwxr-xr-x 5 gitlab-runner gitlab-runner 4096 Oct 22 15:08 Agenda_XV_nettoye
drwxr-xr-x 372 gitlab-runner gitlab-runner 20480 Oct 22 14:44 Amendements_XV
drwxr-xr-x 183 gitlab-runner gitlab-runner 12288 Oct 22 14:01 Amendements_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 237157338 Oct 22 14:01 Amendements_XV.json.zip
drwxr-xr-x 372 gitlab-runner gitlab-runner 20480 Oct 22 15:26 Amendements_XV_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:18 AMO10_deputes_actifs_mandats_actifs_organes_XV
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 13:37 AMO10_deputes_actifs_mandats_actifs_organes_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 1526349 Oct 22 13:37 AMO10_deputes_actifs_mandats_actifs_organes_XV.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:50 AMO10_deputes_actifs_mandats_actifs_organes_XV_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:20 AMO20_dep_sen_min_tous_mandats_et_organes_XIV
-rw-r--r-- 1 gitlab-runner gitlab-runner 27168925 Oct 22 13:37 AMO20_dep_sen_min_tous_mandats_et_organes_XIV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 922349 Oct 22 13:37 AMO20_dep_sen_min_tous_mandats_et_organes_XIV.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:53 AMO20_dep_sen_min_tous_mandats_et_organes_XIV_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:19 AMO20_dep_sen_min_tous_mandats_et_organes_XV
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 13:37 AMO20_dep_sen_min_tous_mandats_et_organes_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 2578774 Oct 22 13:37 AMO20_dep_sen_min_tous_mandats_et_organes_XV.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:51 AMO20_dep_sen_min_tous_mandats_et_organes_XV_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:20 AMO30_tous_acteurs_tous_mandats_tous_organes_historique
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 13:37 AMO30_tous_acteurs_tous_mandats_tous_organes_historique.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 6720718 Oct 22 13:37 AMO30_tous_acteurs_tous_mandats_tous_organes_historique.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:58 AMO30_tous_acteurs_tous_mandats_tous_organes_historique_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:23 AMO40_deputes_actifs_mandats_actifs_organes_divises_XV
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 13:38 AMO40_deputes_actifs_mandats_actifs_organes_divises_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 2810294 Oct 22 13:38 AMO40_deputes_actifs_mandats_actifs_organes_divises_XV.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 15:00 AMO40_deputes_actifs_mandats_actifs_organes_divises_XV_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:24 AMO50_acteurs_mandats_organes_divises_XV
drwxr-xr-x 5 gitlab-runner gitlab-runner 4096 Oct 22 13:42 AMO50_acteurs_mandats_organes_divises_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 28544736 Oct 22 13:38 AMO50_acteurs_mandats_organes_divises_XV.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 15:03 AMO50_acteurs_mandats_organes_divises_XV_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:45 Dossiers_Legislatifs_XIV
-rw-r--r-- 1 gitlab-runner gitlab-runner 67256038 Oct 22 14:17 Dossiers_Legislatifs_XIV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 2570692 Oct 22 14:17 Dossiers_Legislatifs_XIV.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 15:29 Dossiers_Legislatifs_XIV_nettoye
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:44 Dossiers_Legislatifs_XV
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 14:17 Dossiers_Legislatifs_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 5368155 Oct 22 14:17 Dossiers_Legislatifs_XV.json.zip
drwxr-xr-x 4 gitlab-runner gitlab-runner 4096 Oct 22 15:26 Dossiers_Legislatifs_XV_nettoye
drwxr-xr-x 3 gitlab-runner gitlab-runner 4096 Oct 22 14:48 Scrutins_XIV
-rw-r--r-- 1 gitlab-runner gitlab-runner 27094859 Oct 22 14:18 Scrutins_XIV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 671420 Oct 22 14:18 Scrutins_XIV.json.zip
drwxr-xr-x 3 gitlab-runner gitlab-runner 4096 Oct 22 15:33 Scrutins_XIV_nettoye
drwxr-xr-x 3 gitlab-runner gitlab-runner 4096 Oct 22 14:47 Scrutins_XV
-rw-r--r-- 1 gitlab-runner gitlab-runner 66078065 Oct 22 15:34 Scrutins_XV_fusionne.json
drwxr-xr-x 2 gitlab-runner gitlab-runner 86016 Oct 22 14:17 Scrutins_XV.json
-rw-r--r-- 1 gitlab-runner gitlab-runner 4327953 Oct 22 14:17 Scrutins_XV.json.zip
drwxr-xr-x 3 gitlab-runner gitlab-runner 4096 Oct 22 15:31 Scrutins_XV_nettoye
Additional note: the process crashes on the mv. This is because I'm writing the playbook step by step, and in this case I was just testing the mv, which turned out to be a bad approach (I'm preparing to use a fileglob copy instead).