|
|
Created:
Feb. 5, 2016, 5:09 p.m. by Felix Dahlke Modified:
Feb. 9, 2016, 1:36 p.m. Visibility:
Public. |
DescriptionIssue 1299 - Generate docs outside the devbuild build process
Patch Set 1 #
Total comments: 16
Patch Set 2 : Address comments, fix issues #
Total comments: 16
Patch Set 3 : Address comments #Patch Set 4 : Use hg clone/pull instead of hg archive #
Total comments: 2
Patch Set 5 : Separate pull and update, specify revision to include #
MessagesTotal messages: 15
https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:2: # coding: utf-8 Not that a coding declaration is necessary here. Moreover, according to PEP-8 source files (in Python 2) should use the ASCII anyway. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:25: config = dict(get_config().items("docs")) I'd prefer if you use the ConfigParser API consistently rather than turning it into a dictionary for no obvious reason. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:34: if field_name not in ["repository", "target_directory", "command"]: Nit: A set seems more appropriate here. In case you didn't know yet there are also set literals: {"repository", "target_directory", "command"} https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:36: projects.setdefault(project_name, {}) There is a reason that setdefault() returns the respective value: projects.setdefault(project_name, {})[field_name] = value https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:41: remote_id = subprocess.check_output(["hg", "id", "-i", repository_url]) Nit: Please use long option names in scripts, that people not familiar with the invoked program have a better idea what it does. (The shorthand syntax mostly exist for convenience on the shell.) https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:42: id_path = sources_dir.rstrip("/") + ".id" Please don't hard-code file separators, use.ospath.sep. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:43: if os.path.exists(id_path): It's better to handle the corresponding IOError, rather than performing checks in advance. That not only keeps the kernel/IO-roundtrip small, but also avoids race conditions. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:44: with open(id_path, "r") as id_file: Please always add "b" to the file mode, for consistent behavior on Windows.
Thanks, comments addressed. I found two issues I addressed as well, see my comments. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:2: # coding: utf-8 On 2016/02/05 18:02:32, Sebastian Noack wrote: > Not that a coding declaration is necessary here. Moreover, according to PEP-8 > source files (in Python 2) should use the ASCII anyway. Wow true, either that part changed or we misunderstood it all along. We have this coding declaration in 95% of the source files in sitescripts. I'll leave it out here then, but we should consider changing that across the bank. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:25: config = dict(get_config().items("docs")) On 2016/02/05 18:02:31, Sebastian Noack wrote: > I'd prefer if you use the ConfigParser API consistently rather than turning it > into a dictionary for no obvious reason. Done. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:34: if field_name not in ["repository", "target_directory", "command"]: On 2016/02/05 18:02:32, Sebastian Noack wrote: > Nit: A set seems more appropriate here. In case you didn't know yet there are > also set literals: > > {"repository", "target_directory", "command"} No I didn't, nice. Done. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:36: projects.setdefault(project_name, {}) On 2016/02/05 18:02:31, Sebastian Noack wrote: > There is a reason that setdefault() returns the respective value: > > projects.setdefault(project_name, {})[field_name] = value Done. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:41: remote_id = subprocess.check_output(["hg", "id", "-i", repository_url]) On 2016/02/05 18:02:32, Sebastian Noack wrote: > Nit: Please use long option names in scripts, that people not familiar with the > invoked program have a better idea what it does. (The shorthand syntax mostly > exist for convenience on the shell.) Done. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:42: id_path = sources_dir.rstrip("/") + ".id" On 2016/02/05 18:02:31, Sebastian Noack wrote: > Please don't hard-code file separators, use.ospath.sep. Done. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:43: if os.path.exists(id_path): On 2016/02/05 18:02:31, Sebastian Noack wrote: > It's better to handle the corresponding IOError, rather than performing checks > in advance. That not only keeps the kernel/IO-roundtrip small, but also avoids > race conditions. Fair enough, I've gotten rid of the os.path.exists() I had an idea for how to get rid of. Ideas welcome for the remaining ones. https://codereview.adblockplus.org/29335805/diff/29335806/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:44: with open(id_path, "r") as id_file: On 2016/02/05 18:02:32, Sebastian Noack wrote: > Please always add "b" to the file mode, for consistent behavior on Windows. Done. https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.example File .sitescripts.example (right): https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.exampl... .sitescripts.example:210: adblockplus_command=./build.py -t gecko docs {output_dir} &>/dev/null Unless I do this, the ensure_dependencies.py output shows up. I'm wondering if we shouldn't just always swallow all output from these commands... https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:63: # Because of how ensure_dependencies.py works, we have to create fake .hg I discovered that ensure_dependencies.py, being required for `./build.py docs`, fails if it doesn't find either a `.hg` or `.git` directory. Since we're doing a shallow clone, there is no such thing. It's easy enough to hack around this, but I'm not convinced that's the right way to do it. I see two alternatives: 1. We move (back) to using clone and pull here rather than hg archive and the remote ID check. (Wladimir objected to that in the last patch I wrote for this.) 2. We change ensure_dependencies.py to go for Mercurial by default, but that won't help us here, it'll take a while before that change has propagated into adblockplus, adblockpluschrome, adblockpluscore and libadblockplus. I'm somewhat leaning towards #2. We could create a goodfirstbug to change this in ensure_dependencies.py and refer to this piece of code from that review, so it gets removed later.
https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:63: # Because of how ensure_dependencies.py works, we have to create fake .hg On 2016/02/05 20:47:59, Felix Dahlke wrote: > I discovered that ensure_dependencies.py, being required for `./build.py docs`, > fails if it doesn't find either a `.hg` or `.git` directory. Since we're doing a > shallow clone, there is no such thing. It's easy enough to hack around this, but > I'm not convinced that's the right way to do it. I see two alternatives: > > 1. We move (back) to using clone and pull here rather than hg archive and the > remote ID check. (Wladimir objected to that in the last patch I wrote for this.) > > 2. We change ensure_dependencies.py to go for Mercurial by default, but that > won't help us here, it'll take a while before that change has propagated into > adblockplus, adblockpluschrome, adblockpluscore and libadblockplus. > > I'm somewhat leaning towards #2. We could create a goodfirstbug to change this > in ensure_dependencies.py and refer to this piece of code from that review, so > it gets removed later. Moreover, build.py requires buildtools, and if you don't clone the repo and set the CWD accordingly we rely on the buildtools version from the server anyway. So how about supporting buildtools natively here as before but alternatively to a custom command (for libadblockplus). The config therefore could look like: adblockpluschrome_jsdoc = true ... libadblockplus_command = make docs This would also eliminate the boilerplate in adblockpluscore and the /dev/null hack in the config here.
https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:63: # Because of how ensure_dependencies.py works, we have to create fake .hg On 2016/02/05 22:46:18, Sebastian Noack wrote: > Moreover, build.py requires buildtools, and if you don't clone the > repo and set the CWD accordingly we rely on the buildtools version > from the server anyway. Since I invoke the repository's build.py here, that's exactly what happens. > So how about supporting buildtools natively here as before but > alternatively to a custom command (for libadblockplus). The config > therefore could look like: > > adblockpluschrome_jsdoc = true > ... > libadblockplus_command = make docs Hm... I'm not sure. It would simplify things, and we had it this way before. On the other hand, it seems wrong because sitescripts would always use the latest version of buildtools even if, for some reason, some project needs an older version. Wladimir, thoughts?
https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.example File .sitescripts.example (right): https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.exampl... .sitescripts.example:210: adblockplus_command=./build.py -t gecko docs {output_dir} &>/dev/null On 2016/02/05 20:47:59, Felix Dahlke wrote: > Unless I do this, the ensure_dependencies.py output shows up. I'm wondering if > we shouldn't just always swallow all output from these commands... That's probably a bad idea. If something is wrong on the server, the corresponding errors should be logged and sent via cron mails. Otherwise we wouldn't know when things are broken. So if we keep calling build.py here, we really should only filter out the INFO log messages. One way to do that, would be introducing an environment variable or an option that is passed through from build.py, to set the log level. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:49: except OSError: shutil.rmtree() has an ingnore_errors argument. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:63: # Because of how ensure_dependencies.py works, we have to create fake .hg On 2016/02/06 06:42:55, Felix Dahlke wrote: > On 2016/02/05 22:46:18, Sebastian Noack wrote: > > Moreover, build.py requires buildtools, and if you don't clone the > > repo and set the CWD accordingly we rely on the buildtools version > > from the server anyway. > > Since I invoke the repository's build.py here, that's exactly what happens. Yeah, I missed something. You are right. That should be fine. > > So how about supporting buildtools natively here as before but > > alternatively to a custom command (for libadblockplus). The config > > therefore could look like: > > > > adblockpluschrome_jsdoc = true > > ... > > libadblockplus_command = make docs > > Hm... I'm not sure. It would simplify things, and we had it this way before. On > the other hand, it seems wrong because sitescripts would always use the latest > version of buildtools even if, for some reason, some project needs an older > version. Note that this currently also the case for the nightlies generation. But I wouldn't insist. Changing ensure_dependencyies.py to use mercurial if running outside of an repo sounds good too. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:76: old_target_dir = target_dir.rstrip(os.path.sep) + ".old" Generating backups seems to be unnecessary as the content is auto-generated and can be reproduced if necessary anyway. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:107: if __name__ == "__main__": How about calling get_config() here and passing the config to read_projects() and generate_docs below?
New patch set is up. https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.example File .sitescripts.example (right): https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.exampl... .sitescripts.example:210: adblockplus_command=./build.py -t gecko docs {output_dir} &>/dev/null On 2016/02/06 09:49:50, Sebastian Noack wrote: > On 2016/02/05 20:47:59, Felix Dahlke wrote: > > Unless I do this, the ensure_dependencies.py output shows up. I'm wondering if > > we shouldn't just always swallow all output from these commands... > > That's probably a bad idea. If something is wrong on the server, the > corresponding errors should be logged and sent via cron mails. Otherwise we > wouldn't know when things are broken. > > So if we keep calling build.py here, we really should only filter out the INFO > log messages. One way to do that, would be introducing an environment variable > or an option that is passed through from build.py, to set the log level. Yes you're right, that's much better. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:49: except OSError: On 2016/02/06 09:49:50, Sebastian Noack wrote: > shutil.rmtree() has an ingnore_errors argument. Awesome, I was very unhappy with all this boiler plate. Now it's just os.makedirs that annoys me... https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:63: # Because of how ensure_dependencies.py works, we have to create fake .hg On 2016/02/06 09:49:50, Sebastian Noack wrote: > On 2016/02/06 06:42:55, Felix Dahlke wrote: > > On 2016/02/05 22:46:18, Sebastian Noack wrote: > > > Moreover, build.py requires buildtools, and if you don't clone the > > > repo and set the CWD accordingly we rely on the buildtools version > > > from the server anyway. > > > > Since I invoke the repository's build.py here, that's exactly what happens. > > Yeah, I missed something. You are right. That should be fine. > > > > So how about supporting buildtools natively here as before but > > > alternatively to a custom command (for libadblockplus). The config > > > therefore could look like: > > > > > > adblockpluschrome_jsdoc = true > > > ... > > > libadblockplus_command = make docs > > > > Hm... I'm not sure. It would simplify things, and we had it this way before. > On > > the other hand, it seems wrong because sitescripts would always use the latest > > version of buildtools even if, for some reason, some project needs an older > > version. > > Note that this currently also the case for the nightlies generation. But I > wouldn't insist. Changing ensure_dependencyies.py to use mercurial if running > outside of an repo sounds good too. Done. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:76: old_target_dir = target_dir.rstrip(os.path.sep) + ".old" On 2016/02/06 09:49:50, Sebastian Noack wrote: > Generating backups seems to be unnecessary as the content is auto-generated and > can be reproduced if necessary anyway. This isn't really a backup, it's just so that replacing the old with the new docs is faster, the *.old directory is removed after the rename. That was a suggestion from Wladimir from the last time I tackled this. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:107: if __name__ == "__main__": On 2016/02/06 09:49:51, Sebastian Noack wrote: > How about calling get_config() here and passing the config to read_projects() > and generate_docs below? Done.
https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.example File .sitescripts.example (right): https://codereview.adblockplus.org/29335805/diff/29335820/.sitescripts.exampl... .sitescripts.example:210: adblockplus_command=./build.py -t gecko docs {output_dir} &>/dev/null On 2016/02/06 10:46:39, Felix Dahlke wrote: > On 2016/02/06 09:49:50, Sebastian Noack wrote: > > On 2016/02/05 20:47:59, Felix Dahlke wrote: > > > Unless I do this, the ensure_dependencies.py output shows up. I'm wondering > if > > > we shouldn't just always swallow all output from these commands... > > > > That's probably a bad idea. If something is wrong on the server, the > > corresponding errors should be logged and sent via cron mails. Otherwise we > > wouldn't know when things are broken. > > > > So if we keep calling build.py here, we really should only filter out the INFO > > log messages. One way to do that, would be introducing an environment variable > > or an option that is passed through from build.py, to set the log level. > > Yes you're right, that's much better. LGTM with this being resolved. https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335820/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:76: old_target_dir = target_dir.rstrip(os.path.sep) + ".old" On 2016/02/06 10:46:39, Felix Dahlke wrote: > On 2016/02/06 09:49:50, Sebastian Noack wrote: > > Generating backups seems to be unnecessary as the content is auto-generated > and > > can be reproduced if necessary anyway. > > This isn't really a backup, it's just so that replacing the old with the new > docs is faster, the *.old directory is removed after the rename. That was a > suggestion from Wladimir from the last time I tackled this. Acknowledged.
Uploaded a new patch set that uses hg clone/pull instead of hg archive. Benefits: 1. We can run this on a server other than server16 (I think it belongs on web2, the one that hosts adblockplus.org) 2. The code is a lot simpler this way Downside is that the initial clone causes more network traffic, and that this uses more disk space. But neither seems like a bottleneck here.
Didnt you say that palant objected to clone the repo here?
On 2016/02/08 10:04:14, Sebastian Noack wrote: > Didnt you say that palant objected to clone the repo here? Yes, but I presume he didn't have the implications in mind at the time, we shouldn't run anything on server16 at this point, and I also don't like how complex the code ended up. He's in CC here. Wladimir, thoughts?
I agree that we shouldn't be running this on server16, particularly not with docs for libadblockplus and others being generated. The question is however whether we really want the repository sync to be performed by this script or rather by our infrastructure like we do it elsewhere. I can see advantages and disadvantages for both, slightly leaning towards leaving the current approach however. https://codereview.adblockplus.org/29335805/diff/29335968/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335968/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:39: "--repository", sources_dir]) You don't really want to call |hg pull --update|, it will update to repository tip and only if it is a linear update. You should rather call |hg update --rev master| after pulling explicitly. You can also pull with --rev master parameter, this might avoid pulling unnecessary branches.
New patch set is up. https://codereview.adblockplus.org/29335805/diff/29335968/sitescripts/docs/bi... File sitescripts/docs/bin/generate_docs.py (right): https://codereview.adblockplus.org/29335805/diff/29335968/sitescripts/docs/bi... sitescripts/docs/bin/generate_docs.py:39: "--repository", sources_dir]) On 2016/02/09 10:44:28, Wladimir Palant wrote: > You don't really want to call |hg pull --update|, it will update to repository > tip and only if it is a linear update. You should rather call |hg update --rev > master| after pulling explicitly. You can also pull with --rev master parameter, > this might avoid pulling unnecessary branches. Done.
LGTM
LGTM |