|
|
Created:
Aug. 28, 2018, 8:35 a.m. by tlucas Modified:
Sept. 6, 2018, 5:30 p.m. CC:
sergei Base URL:
https://codereview.adblockplus.org/29862580/ Visibility:
Public. |
DescriptionIssue 6890 - run npm tests in parallel
Prerequisites:
#6651
#6887
Patch Set 1 #
Total comments: 13
Patch Set 2 : #Patch Set 3 : #
Total comments: 6
Patch Set 4 : PS4 - rename "tests:global" to "lint" #
Total comments: 9
Patch Set 5 : #
Total comments: 3
Patch Set 6 : #MessagesTotal messages: 16
https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py I wonder whether this should rather go in pretest in package.json? The advantage would be that we'd also make sure the dependencies are there and up to date when running "npm test" manually during development. https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:42: qunit:gecko: That the only thing currently performed is running the qunit tests is matter to change. So I wouldn't name the job qunit:*, or do you plan to have a separate job to evaluate the test pages? I might rather not, since then we have to redundantly start the browser, initialize the driver, and install the extension. Also we might want to have a separate job for ESlint. Otherwise, we'd redundantly run ESLint for each job, and if it fails each job will fail.
Patch Set 2: * Call "npm install" again, to be sure https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/08/28 11:22:55, Sebastian Noack wrote: > I wonder whether this should rather go in pretest in package.json? The advantage > would be that we'd also make sure the dependencies are there and up to date when > running "npm test" manually during development. Since "python build.py" - which is called anyway in the current pretest script / in the mocha tests, which are yet to be landed - calls ensure_dependencies.py anyway we are sure, that everything is up to date. Furthermore, this implies that in the actual jobs below we are also sure that dependencies are installed and up to date. There is, unfortunately, no guarantee, that caching will succeed in time for subsequent jobs. Which brings me to: i should call "npm install" again in the jobs below, changed that. https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:42: qunit:gecko: On 2018/08/28 11:22:55, Sebastian Noack wrote: > That the only thing currently performed is running the qunit tests is matter to > change. So I wouldn't name the job qunit:*, or do you plan to have a separate > job to evaluate the test pages? I might rather not, since then we have to > redundantly start the browser, initialize the driver, and install the extension. > > Also we might want to have a separate job for ESlint. Otherwise, we'd > redundantly run ESLint for each job, and if it fails each job will fail. No plans for jobs per tests, so changed the naming to tests:*. However, the latter has to be prepared in package.json (i.e. take "eslint" out out "posttest").
https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/08/28 12:21:59, tlucas wrote: > On 2018/08/28 11:22:55, Sebastian Noack wrote: > > I wonder whether this should rather go in pretest in package.json? The > advantage > > would be that we'd also make sure the dependencies are there and up to date > when > > running "npm test" manually during development. > > Since "python build.py" - which is called anyway in the current pretest script / > in the mocha tests, which are yet to be landed - calls ensure_dependencies.py > anyway we are sure, that everything is up to date. > > Furthermore, this implies that in the actual jobs below we are also sure that > dependencies are installed and up to date. There is, unfortunately, no > guarantee, that caching will succeed in time for subsequent jobs. Which brings > me to: i should call "npm install" again in the jobs below, changed that. After your changes running the tests as well on Chrome, build.py isn't invoked by pretest anymore, but from within the test suite. The test suite, however, imports code from adblockpluscore. https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:42: qunit:gecko: On 2018/08/28 12:21:59, tlucas wrote: > However, the latter has to be prepared in package.json (i.e. take "eslint" out > out "posttest"). I know, go ahead. :)
https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/08/28 12:50:57, Sebastian Noack wrote: > On 2018/08/28 12:21:59, tlucas wrote: > > On 2018/08/28 11:22:55, Sebastian Noack wrote: > > > I wonder whether this should rather go in pretest in package.json? The > > advantage > > > would be that we'd also make sure the dependencies are there and up to date > > when > > > running "npm test" manually during development. > > > > Since "python build.py" - which is called anyway in the current pretest script > / > > in the mocha tests, which are yet to be landed - calls ensure_dependencies.py > > anyway we are sure, that everything is up to date. > > > > Furthermore, this implies that in the actual jobs below we are also sure that > > dependencies are installed and up to date. There is, unfortunately, no > > guarantee, that caching will succeed in time for subsequent jobs. Which brings > > me to: i should call "npm install" again in the jobs below, changed that. > > After your changes running the tests as well on Chrome, build.py isn't invoked > by pretest anymore, but from within the test suite. The test suite, however, > imports code from adblockpluscore. Gotya. I moved "python ensure_dependencies.py" to "pretest" accordingly - as well as to the tests' local before_scripts. (IMHO moving "ensure_dependencies.py" into the "pretest" script is rather a workaround for local dependency management - and in CI i don't want to have that management scattered all over the place / hidden in some package.json files. and FWIW, since both "npm install" and "ensure_dependencies" are essentially noops, IMO this doesn't hurt.) https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:42: qunit:gecko: On 2018/08/28 12:50:57, Sebastian Noack wrote: > On 2018/08/28 12:21:59, tlucas wrote: > > However, the latter has to be prepared in package.json (i.e. take "eslint" out > > out "posttest"). > > I know, go ahead. :) Done.
Other than that LGTM https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml#newc... .gitlab-ci.yml:28: key: cache_$CI_COMMIT_SHA IDEA: Have you thought about using a hash of the dependencies as a cache key instead of the commit hash i.e. hash dependencies, add python dependencies to the repository via a `dependencies.txt` or similar (like you had in previous versions) and whatever else needs to be considered?
https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/08/29 08:14:27, tlucas wrote: > On 2018/08/28 12:50:57, Sebastian Noack wrote: > > On 2018/08/28 12:21:59, tlucas wrote: > > > On 2018/08/28 11:22:55, Sebastian Noack wrote: > > > > I wonder whether this should rather go in pretest in package.json? The > > > advantage > > > > would be that we'd also make sure the dependencies are there and up to > date > > > when > > > > running "npm test" manually during development. > > > > > > Since "python build.py" - which is called anyway in the current pretest > script > > / > > > in the mocha tests, which are yet to be landed - calls > ensure_dependencies.py > > > anyway we are sure, that everything is up to date. > > > > > > Furthermore, this implies that in the actual jobs below we are also sure > that > > > dependencies are installed and up to date. There is, unfortunately, no > > > guarantee, that caching will succeed in time for subsequent jobs. Which > brings > > > me to: i should call "npm install" again in the jobs below, changed that. > > > > After your changes running the tests as well on Chrome, build.py isn't invoked > > by pretest anymore, but from within the test suite. The test suite, however, > > imports code from adblockpluscore. > > Gotya. I moved "python ensure_dependencies.py" to "pretest" accordingly - as > well as to the tests' local before_scripts. (IMHO moving > "ensure_dependencies.py" into the "pretest" script is rather a workaround for > local dependency management - and in CI i don't want to have that management > scattered all over the place / hidden in some package.json files. and FWIW, > since both "npm install" and "ensure_dependencies" are essentially noops, IMO > this doesn't hurt.) Now, we are running ensure_dependencies.py 3 times?! Once in the prepare stage, once in the before_script, and then again in "pretest" in packages.json. (Actually, it's 4 times, if you count build.py calling it again, but we cannot really avoid that). But running it in the prepare stage and before_script seems redundant to me. If it's just to make it more visible, I don't think this is justified, or is there any other benefit? As for "npm install", it's even worse, since it's quite expensive even if there is nothing to do. Every time you "npm install" it talks to a bunch of servers to check for updates. https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml#newc... .gitlab-ci.yml:56: tests:global: This isn't quite "testing" we are doing here, "lint" seems to be a a better name for this job.
i'll only comment to the one discussion this time, will tackle the other comments later. https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/08/29 21:03:11, Sebastian Noack wrote: > On 2018/08/29 08:14:27, tlucas wrote: > > On 2018/08/28 12:50:57, Sebastian Noack wrote: > > > On 2018/08/28 12:21:59, tlucas wrote: > > > > On 2018/08/28 11:22:55, Sebastian Noack wrote: > > > > > I wonder whether this should rather go in pretest in package.json? The > > > > advantage > > > > > would be that we'd also make sure the dependencies are there and up to > > date > > > > when > > > > > running "npm test" manually during development. > > > > > > > > Since "python build.py" - which is called anyway in the current pretest > > script > > > / > > > > in the mocha tests, which are yet to be landed - calls > > ensure_dependencies.py > > > > anyway we are sure, that everything is up to date. > > > > > > > > Furthermore, this implies that in the actual jobs below we are also sure > > that > > > > dependencies are installed and up to date. There is, unfortunately, no > > > > guarantee, that caching will succeed in time for subsequent jobs. Which > > brings > > > > me to: i should call "npm install" again in the jobs below, changed that. > > > > > > After your changes running the tests as well on Chrome, build.py isn't > invoked > > > by pretest anymore, but from within the test suite. The test suite, however, > > > imports code from adblockpluscore. > > > > Gotya. I moved "python ensure_dependencies.py" to "pretest" accordingly - as > > well as to the tests' local before_scripts. (IMHO moving > > "ensure_dependencies.py" into the "pretest" script is rather a workaround for > > local dependency management - and in CI i don't want to have that management > > scattered all over the place / hidden in some package.json files. and FWIW, > > since both "npm install" and "ensure_dependencies" are essentially noops, IMO > > this doesn't hurt.) > > Now, we are running ensure_dependencies.py 3 times?! Once in the prepare stage, > once in the before_script, and then again in "pretest" in packages.json. > (Actually, it's 4 times, if you count build.py calling it again, but we cannot > really avoid that). But running it in the prepare stage and before_script seems > redundant to me. If it's just to make it more visible, I don't think this is > justified, or is there any other benefit? > > As for "npm install", it's even worse, since it's quite expensive even if there > is nothing to do. Every time you "npm install" it talks to a bunch of servers to > check for updates. Let me try to clarify the current state (chronologically): 1) The first invocation of "ensure_dependencies.py" (i.e. while preparing the cache) is merely an optimization step, though not necessary this speeds up all subsequent jobs immensely. Same applies to "npm install" 2) The second call to "ensure_dependencies.py" (and "npm install") from the jobs' local "before_script" is to actually ensure that all dependencies are installed. Mind that there is *no guarantee* (Winsley once encountered this IIRC), that the cache is actually available when a subsequent job starts running. 3) The third call (from within "pretest") is now a noop besides "npm install" in a CI environment (since we are now sure that everything is installed), but necessary for local development 4) The fourth call is triggered by invoking "build.py", this we can not change / don't want to change. So the steps worth looking at are 1) and 2), where 1) only is necessary while talking about parallel jobs with shared dependencies (which is true for us) and step 3). For step 3), we could run "build.py devenv -t <platform>" in the "all.js" script, *before* actually requiring the code from adblockpluscore (meaning we could get rid of the "pretest" section, when installing and requiring in order) For steps 1) and 2) -> The only real option would be not installing the dependencies in "prepare-dependencies" (but for every job individually) since there is no guarantee that caching succeeds. Meaning, if we keep step 1), we have to keep step 2) (to be sure). So i would definitely keep them, as they are. IF step 1) and the caching succeeds, we have a run time for installing the dependencies in step 2) (for each parallel running job!) at about 8% (even with multiple "npm install"s). If it doesn't succeed (haven't encountered that so far for adblockpluschrome) we have a worst case scenario of 100% run time in *each* subsequent job. So the aforementioned only option would leave us with 100% run time for each subsequent job, and mind that there are more to come (building / deploying)
https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/08/30 11:05:58, tlucas wrote: > On 2018/08/29 21:03:11, Sebastian Noack wrote: > > On 2018/08/29 08:14:27, tlucas wrote: > > > On 2018/08/28 12:50:57, Sebastian Noack wrote: > > > > On 2018/08/28 12:21:59, tlucas wrote: > > > > > On 2018/08/28 11:22:55, Sebastian Noack wrote: > > > > > > I wonder whether this should rather go in pretest in package.json? The > > > > > advantage > > > > > > would be that we'd also make sure the dependencies are there and up to > > > date > > > > > when > > > > > > running "npm test" manually during development. > > > > > > > > > > Since "python build.py" - which is called anyway in the current pretest > > > script > > > > / > > > > > in the mocha tests, which are yet to be landed - calls > > > ensure_dependencies.py > > > > > anyway we are sure, that everything is up to date. > > > > > > > > > > Furthermore, this implies that in the actual jobs below we are also sure > > > that > > > > > dependencies are installed and up to date. There is, unfortunately, no > > > > > guarantee, that caching will succeed in time for subsequent jobs. Which > > > brings > > > > > me to: i should call "npm install" again in the jobs below, changed > that. > > > > > > > > After your changes running the tests as well on Chrome, build.py isn't > > invoked > > > > by pretest anymore, but from within the test suite. The test suite, > however, > > > > imports code from adblockpluscore. > > > > > > Gotya. I moved "python ensure_dependencies.py" to "pretest" accordingly - as > > > well as to the tests' local before_scripts. (IMHO moving > > > "ensure_dependencies.py" into the "pretest" script is rather a workaround > for > > > local dependency management - and in CI i don't want to have that management > > > scattered all over the place / hidden in some package.json files. and FWIW, > > > since both "npm install" and "ensure_dependencies" are essentially noops, > IMO > > > this doesn't hurt.) > > > > Now, we are running ensure_dependencies.py 3 times?! Once in the prepare > stage, > > once in the before_script, and then again in "pretest" in packages.json. > > (Actually, it's 4 times, if you count build.py calling it again, but we cannot > > really avoid that). But running it in the prepare stage and before_script > seems > > redundant to me. If it's just to make it more visible, I don't think this is > > justified, or is there any other benefit? > > > > As for "npm install", it's even worse, since it's quite expensive even if > there > > is nothing to do. Every time you "npm install" it talks to a bunch of servers > to > > check for updates. > > > Let me try to clarify the current state (chronologically): > > 1) The first invocation of "ensure_dependencies.py" (i.e. while preparing the > cache) is merely an optimization step, though not necessary this speeds up all > subsequent jobs immensely. Same applies to "npm install" > > 2) The second call to "ensure_dependencies.py" (and "npm install") from the > jobs' local "before_script" is to actually ensure that all dependencies are > installed. Mind that there is *no guarantee* (Winsley once encountered this > IIRC), that the cache is actually available when a subsequent job starts > running. > > 3) The third call (from within "pretest") is now a noop besides "npm install" in > a CI environment (since we are now sure that everything is installed), but > necessary for local development > > 4) The fourth call is triggered by invoking "build.py", this we can not change / > don't want to change. > > So the steps worth looking at are 1) and 2), where 1) only is necessary while > talking about parallel jobs with shared dependencies (which is true for us) and > step 3). > > For step 3), we could run "build.py devenv -t <platform>" in the "all.js" > script, *before* actually requiring the code from adblockpluscore (meaning we > could get rid of the "pretest" section, when installing and requiring in order) > > For steps 1) and 2) -> > The only real option would be not installing the dependencies in > "prepare-dependencies" (but for every job individually) since there is no > guarantee that caching succeeds. Meaning, if we keep step 1), we have to keep > step 2) (to be sure). > > So i would definitely keep them, as they are. > IF step 1) and the caching succeeds, we have a run time for installing the > dependencies in step 2) (for each parallel running job!) at about 8% (even with > multiple "npm install"s). If it doesn't succeed (haven't encountered that so far > for adblockpluschrome) we have a worst case scenario of 100% run time in *each* > subsequent job. > So the aforementioned only option would leave us with 100% run time for each > subsequent job, and mind that there are more to come (building / deploying) I agree that we should ideally have ensure_dependencies.py run before the jobs run, so that we only download the dependencies once. But I'm a little surprised that it's not guaranteed that the prepare stage completed / the cache exists before running the individual jobs. Any chance that this was a configuration issue, or bug in GitLab (that has been or will be fixed)? Can you actually reproduce that behavior at the current time? If not, I might rather remove the before_script step and revisit this once we run into it (if ever).
https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/08/30 15:17:28, Sebastian Noack wrote: > On 2018/08/30 11:05:58, tlucas wrote: > > On 2018/08/29 21:03:11, Sebastian Noack wrote: > > > On 2018/08/29 08:14:27, tlucas wrote: > > > > On 2018/08/28 12:50:57, Sebastian Noack wrote: > > > > > On 2018/08/28 12:21:59, tlucas wrote: > > > > > > On 2018/08/28 11:22:55, Sebastian Noack wrote: > > > > > > > I wonder whether this should rather go in pretest in package.json? > The > > > > > > advantage > > > > > > > would be that we'd also make sure the dependencies are there and up > to > > > > date > > > > > > when > > > > > > > running "npm test" manually during development. > > > > > > > > > > > > Since "python build.py" - which is called anyway in the current > pretest > > > > script > > > > > / > > > > > > in the mocha tests, which are yet to be landed - calls > > > > ensure_dependencies.py > > > > > > anyway we are sure, that everything is up to date. > > > > > > > > > > > > Furthermore, this implies that in the actual jobs below we are also > sure > > > > that > > > > > > dependencies are installed and up to date. There is, unfortunately, no > > > > > > guarantee, that caching will succeed in time for subsequent jobs. > Which > > > > brings > > > > > > me to: i should call "npm install" again in the jobs below, changed > > that. > > > > > > > > > > After your changes running the tests as well on Chrome, build.py isn't > > > invoked > > > > > by pretest anymore, but from within the test suite. The test suite, > > however, > > > > > imports code from adblockpluscore. > > > > > > > > Gotya. I moved "python ensure_dependencies.py" to "pretest" accordingly - > as > > > > well as to the tests' local before_scripts. (IMHO moving > > > > "ensure_dependencies.py" into the "pretest" script is rather a workaround > > for > > > > local dependency management - and in CI i don't want to have that > management > > > > scattered all over the place / hidden in some package.json files. and > FWIW, > > > > since both "npm install" and "ensure_dependencies" are essentially noops, > > IMO > > > > this doesn't hurt.) > > > > > > Now, we are running ensure_dependencies.py 3 times?! Once in the prepare > > stage, > > > once in the before_script, and then again in "pretest" in packages.json. > > > (Actually, it's 4 times, if you count build.py calling it again, but we > cannot > > > really avoid that). But running it in the prepare stage and before_script > > seems > > > redundant to me. If it's just to make it more visible, I don't think this is > > > justified, or is there any other benefit? > > > > > > As for "npm install", it's even worse, since it's quite expensive even if > > there > > > is nothing to do. Every time you "npm install" it talks to a bunch of > servers > > to > > > check for updates. > > > > > > Let me try to clarify the current state (chronologically): > > > > 1) The first invocation of "ensure_dependencies.py" (i.e. while preparing the > > cache) is merely an optimization step, though not necessary this speeds up all > > subsequent jobs immensely. Same applies to "npm install" > > > > 2) The second call to "ensure_dependencies.py" (and "npm install") from the > > jobs' local "before_script" is to actually ensure that all dependencies are > > installed. Mind that there is *no guarantee* (Winsley once encountered this > > IIRC), that the cache is actually available when a subsequent job starts > > running. > > > > 3) The third call (from within "pretest") is now a noop besides "npm install" > in > > a CI environment (since we are now sure that everything is installed), but > > necessary for local development > > > > 4) The fourth call is triggered by invoking "build.py", this we can not change > / > > don't want to change. > > > > So the steps worth looking at are 1) and 2), where 1) only is necessary while > > talking about parallel jobs with shared dependencies (which is true for us) > and > > step 3). > > > > For step 3), we could run "build.py devenv -t <platform>" in the "all.js" > > script, *before* actually requiring the code from adblockpluscore (meaning we > > could get rid of the "pretest" section, when installing and requiring in > order) > > > > For steps 1) and 2) -> > > The only real option would be not installing the dependencies in > > "prepare-dependencies" (but for every job individually) since there is no > > guarantee that caching succeeds. Meaning, if we keep step 1), we have to keep > > step 2) (to be sure). > > > > So i would definitely keep them, as they are. > > IF step 1) and the caching succeeds, we have a run time for installing the > > dependencies in step 2) (for each parallel running job!) at about 8% (even > with > > multiple "npm install"s). If it doesn't succeed (haven't encountered that so > far > > for adblockpluschrome) we have a worst case scenario of 100% run time in > *each* > > subsequent job. > > So the aforementioned only option would leave us with 100% run time for each > > subsequent job, and mind that there are more to come (building / deploying) > > I agree that we should ideally have ensure_dependencies.py run before the jobs > run, so that we only download the dependencies once. But I'm a little surprised > that it's not guaranteed that the prepare stage completed / the cache exists > before running the individual jobs. Any chance that this was a configuration > issue, or bug in GitLab (that has been or will be fixed)? Can you actually > reproduce that behavior at the current time? If not, I might rather remove the > before_script step and revisit this once we run into it (if ever). As far as i understand (from their docs [1]) it's not a bug, but a limitation and not subject to change (at least i found no reference) but very unlikely for our setup (i.e. using the shell executor). I personally could not reproduce that behavior. I would advice against removing these steps from the jobs' before_script, but currently the automated workflow seems to work without them. (keep in mind though, when a developer decides to run a single job *only*, without the preparation - or a job from an earlier commit, where the cache has been deleted in the mean time - the job will always fail). If you insist on removing them, please do say so and i will change it - however, i would definitely advice against that move. [1] https://docs.gitlab.com/ee/ci/caching/#availability-of-the-cache https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml#newc... .gitlab-ci.yml:28: key: cache_$CI_COMMIT_SHA On 2018/08/29 08:39:47, wspee wrote: > IDEA: Have you thought about using a hash of the dependencies as a cache key > instead of the commit hash i.e. hash dependencies, add python dependencies to > the repository via a `dependencies.txt` or similar (like you had in previous > versions) and whatever else needs to be considered? I guess this would work - but IMHO not feasible right now https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml#newc... .gitlab-ci.yml:56: tests:global: On 2018/08/29 21:03:11, Sebastian Noack wrote: > This isn't quite "testing" we are doing here, "lint" seems to be a a better name > for this job. Done. https://codereview.adblockplus.org/29867566/diff/29873555/package.json File package.json (right): https://codereview.adblockplus.org/29867566/diff/29873555/package.json#newcode12 package.json:12: "drivers." Result of rebasing.
https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29867567/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - python ensure_dependencies.py On 2018/09/03 09:54:45, tlucas wrote: > On 2018/08/30 15:17:28, Sebastian Noack wrote: > > On 2018/08/30 11:05:58, tlucas wrote: > > > On 2018/08/29 21:03:11, Sebastian Noack wrote: > > > > On 2018/08/29 08:14:27, tlucas wrote: > > > > > On 2018/08/28 12:50:57, Sebastian Noack wrote: > > > > > > On 2018/08/28 12:21:59, tlucas wrote: > > > > > > > On 2018/08/28 11:22:55, Sebastian Noack wrote: > > > > > > > > I wonder whether this should rather go in pretest in package.json? > > The > > > > > > > advantage > > > > > > > > would be that we'd also make sure the dependencies are there and > up > > to > > > > > date > > > > > > > when > > > > > > > > running "npm test" manually during development. > > > > > > > > > > > > > > Since "python build.py" - which is called anyway in the current > > pretest > > > > > script > > > > > > / > > > > > > > in the mocha tests, which are yet to be landed - calls > > > > > ensure_dependencies.py > > > > > > > anyway we are sure, that everything is up to date. > > > > > > > > > > > > > > Furthermore, this implies that in the actual jobs below we are also > > sure > > > > > that > > > > > > > dependencies are installed and up to date. There is, unfortunately, > no > > > > > > > guarantee, that caching will succeed in time for subsequent jobs. > > Which > > > > > brings > > > > > > > me to: i should call "npm install" again in the jobs below, changed > > > that. > > > > > > > > > > > > After your changes running the tests as well on Chrome, build.py isn't > > > > invoked > > > > > > by pretest anymore, but from within the test suite. The test suite, > > > however, > > > > > > imports code from adblockpluscore. > > > > > > > > > > Gotya. I moved "python ensure_dependencies.py" to "pretest" accordingly > - > > as > > > > > well as to the tests' local before_scripts. (IMHO moving > > > > > "ensure_dependencies.py" into the "pretest" script is rather a > workaround > > > for > > > > > local dependency management - and in CI i don't want to have that > > management > > > > > scattered all over the place / hidden in some package.json files. and > > FWIW, > > > > > since both "npm install" and "ensure_dependencies" are essentially > noops, > > > IMO > > > > > this doesn't hurt.) > > > > > > > > Now, we are running ensure_dependencies.py 3 times?! Once in the prepare > > > stage, > > > > once in the before_script, and then again in "pretest" in packages.json. > > > > (Actually, it's 4 times, if you count build.py calling it again, but we > > cannot > > > > really avoid that). But running it in the prepare stage and before_script > > > seems > > > > redundant to me. If it's just to make it more visible, I don't think this > is > > > > justified, or is there any other benefit? > > > > > > > > As for "npm install", it's even worse, since it's quite expensive even if > > > there > > > > is nothing to do. Every time you "npm install" it talks to a bunch of > > servers > > > to > > > > check for updates. > > > > > > > > > Let me try to clarify the current state (chronologically): > > > > > > 1) The first invocation of "ensure_dependencies.py" (i.e. while preparing > the > > > cache) is merely an optimization step, though not necessary this speeds up > all > > > subsequent jobs immensely. Same applies to "npm install" > > > > > > 2) The second call to "ensure_dependencies.py" (and "npm install") from the > > > jobs' local "before_script" is to actually ensure that all dependencies are > > > installed. Mind that there is *no guarantee* (Winsley once encountered this > > > IIRC), that the cache is actually available when a subsequent job starts > > > running. > > > > > > 3) The third call (from within "pretest") is now a noop besides "npm > install" > > in > > > a CI environment (since we are now sure that everything is installed), but > > > necessary for local development > > > > > > 4) The fourth call is triggered by invoking "build.py", this we can not > change > > / > > > don't want to change. > > > > > > So the steps worth looking at are 1) and 2), where 1) only is necessary > while > > > talking about parallel jobs with shared dependencies (which is true for us) > > and > > > step 3). > > > > > > For step 3), we could run "build.py devenv -t <platform>" in the "all.js" > > > script, *before* actually requiring the code from adblockpluscore (meaning > we > > > could get rid of the "pretest" section, when installing and requiring in > > order) > > > > > > For steps 1) and 2) -> > > > The only real option would be not installing the dependencies in > > > "prepare-dependencies" (but for every job individually) since there is no > > > guarantee that caching succeeds. Meaning, if we keep step 1), we have to > keep > > > step 2) (to be sure). > > > > > > So i would definitely keep them, as they are. > > > IF step 1) and the caching succeeds, we have a run time for installing the > > > dependencies in step 2) (for each parallel running job!) at about 8% (even > > with > > > multiple "npm install"s). If it doesn't succeed (haven't encountered that so > > far > > > for adblockpluschrome) we have a worst case scenario of 100% run time in > > *each* > > > subsequent job. > > > So the aforementioned only option would leave us with 100% run time for each > > > subsequent job, and mind that there are more to come (building / deploying) > > > > I agree that we should ideally have ensure_dependencies.py run before the jobs > > run, so that we only download the dependencies once. But I'm a little > surprised > > that it's not guaranteed that the prepare stage completed / the cache exists > > before running the individual jobs. Any chance that this was a configuration > > issue, or bug in GitLab (that has been or will be fixed)? Can you actually > > reproduce that behavior at the current time? If not, I might rather remove the > > before_script step and revisit this once we run into it (if ever). > > As far as i understand (from their docs [1]) it's not a bug, but a limitation > and not subject to change (at least i found no reference) but very unlikely for > our setup (i.e. using the shell executor). > I personally could not reproduce that behavior. I would advice against removing > these steps from the jobs' before_script, but currently the automated workflow > seems to work without them. (keep in mind though, when a developer decides to > run a single job *only*, without the preparation - or a job from an earlier > commit, where the cache has been deleted in the mean time - the job will always > fail). > If you insist on removing them, please do say so and i will change it - however, > i would definitely advice against that move. > > [1] https://docs.gitlab.com/ee/ci/caching/#availability-of-the-cache Fair enough. https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml#newc... .gitlab-ci.yml:28: key: cache_$CI_COMMIT_SHA On 2018/09/03 09:54:45, tlucas wrote: > On 2018/08/29 08:39:47, wspee wrote: > > IDEA: Have you thought about using a hash of the dependencies as a cache key > > instead of the commit hash i.e. hash dependencies, add python dependencies to > > the repository via a `dependencies.txt` or similar (like you had in previous > > versions) and whatever else needs to be considered? > > I guess this would work - but IMHO not feasible right now I kinda like that idea, but I'm not too familiar with the capabilities of GitLab CI configuration. However, if we could simply run a command here to derive the cache key, this seems to be as simple as: cat dependencies package.json | sha1sum | awk '{ print $1 }' https://codereview.adblockplus.org/29867566/diff/29873555/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29873555/.gitlab-ci.yml#newc... .gitlab-ci.yml:37: - python ensure_dependencies.py FWIW, the lint job below doesn't require ensure_dependencies.py. But not sure how much boilerplate a special case would cause? https://codereview.adblockplus.org/29867566/diff/29873555/.gitlab-ci.yml#newc... .gitlab-ci.yml:38: - npm install Perhaps we can check if node_modules already exists to avoid running "npm install" redundantly (in case the prepare stage took already care of it). Note that (unlike ensure_dependencies.py) "npm install" will always check for updates (which can take a while) even if package.json didn't change. https://codereview.adblockplus.org/29867566/diff/29873555/package.json File package.json (left): https://codereview.adblockplus.org/29867566/diff/29873555/package.json#oldcode28 package.json:28: "posttest": "npm run lint" Perhaps we can leave this in here, and then in the CI configuration run "npm run test" (instead of "npm test"). This should run just the "test" script (but not "pretest" and "posttest"). The advantage is that when running "npm test" during development, it would still lint, and the "pretest" step here is redundant anyway with the CI. What do you think?
Patch Set 5: * Actually cache all dependencies * Address comments https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29868555/.gitlab-ci.yml#newc... .gitlab-ci.yml:28: key: cache_$CI_COMMIT_SHA On 2018/09/03 19:37:35, Sebastian Noack wrote: > On 2018/09/03 09:54:45, tlucas wrote: > > On 2018/08/29 08:39:47, wspee wrote: > > > IDEA: Have you thought about using a hash of the dependencies as a cache key > > > instead of the commit hash i.e. hash dependencies, add python dependencies > to > > > the repository via a `dependencies.txt` or similar (like you had in previous > > > versions) and whatever else needs to be considered? > > > > I guess this would work - but IMHO not feasible right now > > I kinda like that idea, but I'm not too familiar with the capabilities of GitLab > CI configuration. However, if we could simply run a command here to derive the > cache key, this seems to be as simple as: > > cat dependencies package.json | sha1sum | awk '{ print $1 }' Turns out - unfortunately - that we can't use commands for the key, only variables (gitlab specific or predefined from the project settings, but also none exported from within the job) https://codereview.adblockplus.org/29867566/diff/29873555/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29873555/.gitlab-ci.yml#newc... .gitlab-ci.yml:37: - python ensure_dependencies.py On 2018/09/03 19:37:35, Sebastian Noack wrote: > FWIW, the lint job below doesn't require ensure_dependencies.py. But not sure > how much boilerplate a special case would cause? We'd need to copy this whole template into "lint" - IMHO not worth it :) https://codereview.adblockplus.org/29867566/diff/29873555/.gitlab-ci.yml#newc... .gitlab-ci.yml:38: - npm install On 2018/09/03 19:37:35, Sebastian Noack wrote: > Perhaps we can check if node_modules already exists to avoid running "npm > install" redundantly (in case the prepare stage took already care of it). Note > that (unlike ensure_dependencies.py) "npm install" will always check for updates > (which can take a while) even if package.json didn't change. I was thinking about "what if node_modules was created, but npm install was aborted for some reason?" - but then things would fail anyway. Done. https://codereview.adblockplus.org/29867566/diff/29873555/package.json File package.json (left): https://codereview.adblockplus.org/29867566/diff/29873555/package.json#oldcode28 package.json:28: "posttest": "npm run lint" On 2018/09/03 19:37:36, Sebastian Noack wrote: > Perhaps we can leave this in here, and then in the CI configuration run > "npm run test" (instead of "npm test"). This should run just the "test" script > (but not "pretest" and "posttest"). The advantage is that when running "npm > test" during development, it would still lint, and the "pretest" step here is > redundant anyway with the CI. What do you think? Unfortunately it doesn't matter, whether one invokes "npm run test" or "npm test" - pre-/posttest are always invoked (at least with node v8.11.4). I assume it's merely convenience, to be able to leave out "run" here. https://codereview.adblockplus.org/29867566/diff/29873578/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29873578/.gitlab-ci.yml#newc... .gitlab-ci.yml:20: .dependencies: I noticed that i only cache nodejs and adblockpluschrome's dependencies, but no python dependencies (as well as the .git folder) - hereby changed now. IIRC, unlike npm, pip doesn't check for updates and "mkdir -p" is also a noop.
FWIW, the latest Patch set with a real-world pipeline: https://gitlab.com/triluc/adblockpluschrome/pipelines/29315421
https://codereview.adblockplus.org/29867566/diff/29873555/package.json File package.json (left): https://codereview.adblockplus.org/29867566/diff/29873555/package.json#oldcode28 package.json:28: "posttest": "npm run lint" On 2018/09/03 21:35:54, tlucas wrote: > On 2018/09/03 19:37:36, Sebastian Noack wrote: > > Perhaps we can leave this in here, and then in the CI configuration run > > "npm run test" (instead of "npm test"). This should run just the "test" script > > (but not "pretest" and "posttest"). The advantage is that when running "npm > > test" during development, it would still lint, and the "pretest" step here is > > redundant anyway with the CI. What do you think? > > Unfortunately it doesn't matter, whether one invokes "npm run test" or "npm > test" - pre-/posttest are always invoked (at least with node v8.11.4). I assume > it's merely convenience, to be able to leave out "run" here. I guess we could add another script, e.g. test-only. https://codereview.adblockplus.org/29867566/diff/29873578/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29873578/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - if [ ! -d "node_modules" ]; then npm install; fi This could be done simpler: [ -d node_modules ] || npm install
Patch Set 6: * Reordered templates and jobs * Addressed comments https://codereview.adblockplus.org/29867566/diff/29873555/package.json File package.json (left): https://codereview.adblockplus.org/29867566/diff/29873555/package.json#oldcode28 package.json:28: "posttest": "npm run lint" On 2018/09/03 21:57:47, Sebastian Noack wrote: > On 2018/09/03 21:35:54, tlucas wrote: > > On 2018/09/03 19:37:36, Sebastian Noack wrote: > > > Perhaps we can leave this in here, and then in the CI configuration run > > > "npm run test" (instead of "npm test"). This should run just the "test" > script > > > (but not "pretest" and "posttest"). The advantage is that when running "npm > > > test" during development, it would still lint, and the "pretest" step here > is > > > redundant anyway with the CI. What do you think? > > > > Unfortunately it doesn't matter, whether one invokes "npm run test" or "npm > > test" - pre-/posttest are always invoked (at least with node v8.11.4). I > assume > > it's merely convenience, to be able to leave out "run" here. > > I guess we could add another script, e.g. test-only. Right! Done. https://codereview.adblockplus.org/29867566/diff/29873578/.gitlab-ci.yml File .gitlab-ci.yml (right): https://codereview.adblockplus.org/29867566/diff/29873578/.gitlab-ci.yml#newc... .gitlab-ci.yml:25: - if [ ! -d "node_modules" ]; then npm install; fi On 2018/09/03 21:57:47, Sebastian Noack wrote: > This could be done simpler: > > [ -d node_modules ] || npm install Done.
LGTM |