|
|
Description#8852 - ldap sync output filter
Patch Set 1 #
Total comments: 5
Patch Set 2 : #8852 - ldap sync output filter #
Total comments: 8
Patch Set 3 : #8852 - ldap sync output filter #Patch Set 4 : #8852 - ldap sync output filter #Patch Set 5 : #8852 - ldap sync output filter #
Total comments: 6
Patch Set 6 : #8852 - ldap sync output filter #
Total comments: 1
Patch Set 7 : #8852 - ldap sync output filter #Patch Set 8 : #8852 - ldap sync output filter #
Total comments: 2
Patch Set 9 : #8852 - ldap sync output filter #
Total comments: 3
Patch Set 10 : #8852 - ldap sync output filter #MessagesTotal messages: 22
https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh File edit.sh (right): https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode1 edit.sh:1: #!/bin/bash please use /bin/sh instead of /bin/bash This way it is portable across POSIX systems. (only if you make the code POSIX complaint of course) https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode2 edit.sh:2: What is this script about? What is the intention of the script? Please document why are you writing this script so I can understand why this code matters. https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode3 edit.sh:3: IFS=$'\n' lines=($(cat /dev/stdin)) you may want to look into doing this with `read(2)` https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode5 edit.sh:5: for (( i=0; i < ${#lines[*]}; i++ )) and while you are looking into the comment above, also look into the other loop you can do here (hint: while loop), what is the difference between them both and why would you use one or the other https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode7 edit.sh:7: if [[ ${lines[$i]} != *"Updating"* ]] && [[ ${lines[$i]} != *"Found"* ]] && [[ ${lines[$i]} != *"Synchronizing"* ]] && [[ ${lines[$i]} != "#"* ]]; then the double square brackets (`[[`) is a bash feature (http://tiswww.case.edu/php/chet/bash/bashref.html#SEC31) as I said before, we aim to make shell scripts POSIX complaint, so it can run in any POSIX system. then there is this bit "${lines[$i]}" which is not very intuitive IMO, due the way the array is constructed, there are easier and more maintainable ways of doing this. the string comparison can be done in a different way too, with other shell commands, for instance, have you looked into `grep(1)`? (tip: look into its manual page) Always go for 80 (or 79) columns width, this way the code is easier to read, fits nicely into terminals (look for the default value on your linux terminal or the manual pages for example), the code doesn't wrap when you have a smaller screen... there are many reasons to aim for this width. We also aim for not modifying lines that have nothing to do with the change that you might do in the future, so, using your same line as an example, you can do something like: ``` if [[ ${lines[$i]} != *"Updating"* ]] && \ [[ ${lines[$i]} != *"Found"* ]] && \ [[ ${lines[$i]} != *"Synchronizing"* ]] && \ [[ ${lines[$i]} != "#"* ]] \ then ``` I didn't test this, but my point is, if you then need to add another line, or modify the string to compare, you only modify that exact line and not the whole conditional so it is easier to see, also the code is more readable in the end (see, even the 80 columns rule is respected here)
On 2018/05/04 01:50:54, f.lopez wrote: > https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh > File edit.sh (right): > > https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode1 > edit.sh:1: #!/bin/bash > please use /bin/sh instead of /bin/bash > > This way it is portable across POSIX systems. > (only if you make the code POSIX complaint of course) > > https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode2 > edit.sh:2: > What is this script about? What is the intention of the script? Please document > why are you writing this script so I can understand why this code matters. > > https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode3 > edit.sh:3: IFS=$'\n' lines=($(cat /dev/stdin)) > you may want to look into doing this with `read(2)` > > https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode5 > edit.sh:5: for (( i=0; i < ${#lines[*]}; i++ )) > and while you are looking into the comment above, also look into the other loop > you can do here (hint: while loop), what is the difference between them both and > why would you use one or the other > > https://codereview.adblockplus.org/29761555/diff/29761556/edit.sh#newcode7 > edit.sh:7: if [[ ${lines[$i]} != *"Updating"* ]] && [[ ${lines[$i]} != *"Found"* > ]] && [[ ${lines[$i]} != *"Synchronizing"* ]] && [[ ${lines[$i]} != "#"* ]]; > then > the double square brackets (`[[`) is a bash feature > (http://tiswww.case.edu/php/chet/bash/bashref.html#SEC31) > as I said before, we aim to make shell scripts POSIX complaint, so it can run in > any POSIX system. > > then there is this bit "${lines[$i]}" which is not very intuitive IMO, due the > way the array is constructed, there are easier and more maintainable ways of > doing this. > > the string comparison can be done in a different way too, with other shell > commands, for instance, have you looked into `grep(1)`? > (tip: look into its manual page) > > Always go for 80 (or 79) columns width, this way the code is easier to read, > fits nicely into terminals (look for the default value on your linux terminal or > the manual pages for example), the code doesn't wrap when you have a smaller > screen... there are many reasons to aim for this width. > > We also aim for not modifying lines that have nothing to do with the change that > you might do in the future, so, using your same line as an example, you can do > something like: > > ``` > if [[ ${lines[$i]} != *"Updating"* ]] && \ > [[ ${lines[$i]} != *"Found"* ]] && \ > [[ ${lines[$i]} != *"Synchronizing"* ]] && \ > [[ ${lines[$i]} != "#"* ]] \ > then > ``` > I didn't test this, but my point is, if you then need to add another line, or > modify the string to compare, you only modify that exact line and not the whole > conditional so it is easier to see, also the code is more readable in the end > (see, even the 80 columns rule is respected here) muchas gracias Paco
I notice you don't write to `stdout(3)` (-q flag), but then you `echo(1)` some lines... isn't that the default behavior of grep? Also, I'm not familiar with the LDAP output you are dealing with, please send me (via private message is fine) an example of the output and the expected outcome so I can run some tests as well Thanks! https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh File edit.sh (right): https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode7 edit.sh:7: ##################################################################### "Filters out repeated lines from LDAP sync cron job output" Or something like that, no need to put `The cause of this script` or whatever, the shorter, the better (unlike my codereviews) Also, no need to add 4 lines of pure `#`. Just one `#` is enough for the actual documentation https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode11 edit.sh:11: while IFS= read -r line; do This IFS set is redundant here because you are quoting `$line`. Quoting of variables avoid expansions and word splitting, among other things. On a side note, I wonder if the loop is necessary, can't you just pass the whole text? I don't have an example to test this, but maybe you can avoid looping through every line. https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode12 edit.sh:12: if echo "$line" | grep -q "Updating" || can't you avoid repeating yourself here with the `-e` flag? for example(untested code): ``` if echo "$line" | grep -e "Updating" \ -e "Creating" ``` https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode18 edit.sh:18: if echo "$line" | grep -q -v "Updating" && You can use the `-F` flag for all lines so you don't repeat yourself, something like this (untested code): ``` patterns="Updating Creating Found # Synchronizing ->" if echo "$line" | grep -q -v -F "${patterns}" ``` The -F flag works here cause it needs to match all of them. https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode25 edit.sh:25: echo $line my editor shows an empty space at the end of this line, please make sure there is no empty spaces at the end of any line https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode28 edit.sh:28: if echo "$line" | grep -q -F -e "->" no need for `-F` flag since you are matching a single pattern. no need for `-e` flag since you are not using regex either. https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode31 edit.sh:31: echo $line Quote both lines here http://www.tldp.org/LDP/abs/html/quoting.html https://codereview.adblockplus.org/29761555/diff/29781563/edit.sh#newcode33 edit.sh:33: done add an empty line at the end of the file
What makes 99% of the complexity in this script is that we don't want ALL "Updating" lines to be filtered out, but only those who are followed by "->" (which means something has been changed). Otherwise a change has been made but we don't know for which user. That's why I thought about an array of lines at the beginning, but since arrays in bourne shell are way more complicated to implement than in bash (eval or sth) I thought about this version which doesn't require an array. Do you have an idea how to differ between the two types of "Updating" lines (necessary and unecessary) in a whole text solution whithout using an array (and therefore iterating it)? I will upload the output file shortly.
Expected output: all "->" lines + their previous line (in order to know for which user the change has been made), all error messages.
On 2018/05/15 08:56:21, l.rosilio wrote: > What makes 99% of the complexity in this script is that we don't want ALL > "Updating" lines to be filtered out, but only those who are followed by "->" > (which means something has been changed). Otherwise a change has been made but > we don't know for which user. That's why I thought about an array of lines at > the beginning, but since arrays in bourne shell are way more complicated to > implement than in bash (eval or sth) I thought about this version which doesn't > require an array. Do you have an idea how to differ between the two types of > "Updating" lines (necessary and unecessary) in a whole text solution whithout > using an array (and therefore iterating it)? > > I will upload the output file shortly. You can use grep `-B` flag to get the line `B`efore the line you wanted to get, which is the use case for the array you are using
On 2018/05/22 01:47:39, f.lopez wrote: > You can use grep `-B` flag to get the line `B`efore the line you wanted to get, > which is the use case for the array you are using BAM
https://codereview.adblockplus.org/29761555/diff/29788555/edit.sh File edit.sh (right): https://codereview.adblockplus.org/29761555/diff/29788555/edit.sh#newcode3 edit.sh:3: # Filtering out uninformative output lines from the ldap-sync cron job script I don't think this script is only for cron jobs, even tho this could be the use case right now, also don't use gerund or infinitive for the description, "Filtering" -> "Filter" https://codereview.adblockplus.org/29761555/diff/29788555/edit.sh#newcode8 edit.sh:8: be consistent here, either always use a blank line between each block, or not, but if you are using blank lines, then it should be one between `esac` and `done`, in this case tho, for readability, I wouldn't use blank lines https://codereview.adblockplus.org/29761555/diff/29788555/edit.sh#newcode10 edit.sh:10: not blank lines here either https://codereview.adblockplus.org/29761555/diff/29788555/edit.sh#newcode14 edit.sh:14: not blank lines between each option either https://codereview.adblockplus.org/29761555/diff/29788555/edit.sh#newcode16 edit.sh:16: echo $last_updated this is a regression, I mentioned in the first code review that, as a general rule, and unless you know what you are doing, you should always quote variables to avoid globbing and word splitting. Please go through the following link http://www.tldp.org/LDP/abs/html/quoting.html https://codereview.adblockplus.org/29761555/diff/29788555/edit.sh#newcode20 edit.sh:20: *Found*|*\#*|*Synchronizing*) We always avoid modifying unrelated changes, in this case, if you need to modify/add/delete a term that don't need special treatment, you will modify the whole line, which would be an unrelated changes, something like: ``` *Found* | \ *\#* | \ *Synchronizing*) ```
https://codereview.adblockplus.org/29761555/diff/29799591/edit.sh File edit.sh (right): https://codereview.adblockplus.org/29761555/diff/29799591/edit.sh#newcode12 edit.sh:12: *-\>*) Is there ever a case where we could encounter multiple `->` lines after another?
https://codereview.adblockplus.org/29761555/diff/29800576/edit.sh File edit.sh (right): https://codereview.adblockplus.org/29761555/diff/29800576/edit.sh#newcode10 edit.sh:10: *Updating*|*Creating*) Please make the patterns a bit more specific, in order to minimize the risk of filtering-out false-positives. Thank you! https://codereview.adblockplus.org/29761555/diff/29800576/edit.sh#newcode14 edit.sh:14: case "$last_action" in It is not a good idea to base the decision whether to print `$last_updated` on a replica of the former condition (`case "$last_action" in ...`). This is repeating the same logic and therefore creating a 2nd point of possible failure for the exact same condition. In this case, for example, you could just set `last_updated=""` (an empty string) after echoing, and `test -z "$last_updated"` to determine whether to print the last line or not. In general, in order to avoid conditional statements being evaluated a 2nd time (which could have side-effects in each run), you can write the result to a variable and just check on that one.
https://codereview.adblockplus.org/29761555/diff/29805564/edit.sh File edit.sh (right): https://codereview.adblockplus.org/29761555/diff/29805564/edit.sh#newcode10 edit.sh:10: case "$last_action" in As discussed, please replace the inner `case` statement here with something along the lines below: if [ ! -z "$last_action" ]; then echo "$last_action" last_action= fi echo "$line" https://codereview.adblockplus.org/29761555/diff/29805564/edit.sh#newcode21 edit.sh:21: *Found*users* | \ How about `"* Found * users *"`? https://codereview.adblockplus.org/29761555/diff/29805564/edit.sh#newcode30 edit.sh:30: continue Not: The `continue` statement is not exactly necessary here.
LGTM. |