|
|
Created:
Sept. 25, 2017, 7:12 p.m. by Vasily Kuznetsov Modified:
Oct. 4, 2017, 9:43 a.m. CC:
wspee Visibility:
Public. |
DescriptionIssue 5336 - Allow additional include, page, and template paths using CMS
repository: https://hg.adblockplus.org/cms
base revision: fb78657d24c3
Patch Set 1 #
Total comments: 9
Patch Set 2 : Address comments on PS1 #Patch Set 3 : Address comments on PS2, improve docstring of create_source #
Total comments: 10
Patch Set 4 : Address comments on PS3 #
Total comments: 2
Patch Set 5 : Rename 'static' parameter to 'cached' #
MessagesTotal messages: 19
Hi Jon, Sebastian and Julian! Here's the implementation of the "additional paths" feature. I created new source class that can combine the contents from multiple other sources and consolidated the source initialization code inside of `sources.py`. In the process I moved the code that establishes caching in `MercurialSource` from `generate_static_pages.py` to `sources.py`. My preference would be to encapsulate caching even more, so that flushing of the cache would be handled by the source itself and not by `generate_static_pages.py` talking to the caching wrapper. Let me know if you agree and I will do it. Cheers, Vasily https://codereview.adblockplus.org/29555839/diff/29555840/cms/bin/generate_st... File cms/bin/generate_static_pages.py (right): https://codereview.adblockplus.org/29555839/diff/29555840/cms/bin/generate_st... cms/bin/generate_static_pages.py:91: source.resolve_link.cache_clear() This function is renamed while moving the `memoize` decorator to be API-compatible with the upcoming `functools.lru_cache`.
https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:335: for base in self._bases: Nit: You could use any() here as well: return any(base.has_file(filename) for base in self._bases) https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:350: return sorted(files) Nit: This could be written as a set literal one-liner: return {f for base in self._bases for f in base.list_files(subdir)} Any reason why you sort? It seems the other sources also yield results in more or less arbitrary order. https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:366: def create_source(path, static=False, revision='default'): I would rather implement this in Source.__new__, so that the calling code would basically remain the same, i.e: FileSource(args.path) MercurialSource(repo, revision) vs: create_source(args.path) create_source(repo, static=True, revision=revision) This not only preserves a stable API (which probably isn't too important), but that way you also don't need to introduce a boolean flag which you then map to hard-code types, but you can just delegate to cls (first arg of __new__).
Hi Sebastian, Thank you for the comments. I've implemented the first two and replied to the third (see below). Cheers, Vasily https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:335: for base in self._bases: On 2017/09/27 02:47:53, Sebastian Noack wrote: > Nit: You could use any() here as well: > > return any(base.has_file(filename) for base in self._bases) Done. https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:350: return sorted(files) On 2017/09/27 02:47:53, Sebastian Noack wrote: > Nit: This could be written as a set literal one-liner: > > return {f for base in self._bases for f in base.list_files(subdir)} > > Any reason why you sort? It seems the other sources also yield results in more > or less arbitrary order. Done. Thanks for this suggestion. Having thought about it again, there's no particularly good reason to sort. Sorting makes files in the same subdirectory appear together, even if they are coming from different base sources, but this doesn't seem to be used anywhere, so it's not really important. What might be important is the type of the return value. Sorted returns a list while your proposal returns a set. I wrapped the set in a list for now, to be on the safe side. Let me know if you think this is not necessary. https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:366: def create_source(path, static=False, revision='default'): On 2017/09/27 02:47:53, Sebastian Noack wrote: > I would rather implement this in Source.__new__, so that the calling code would > basically remain the same, i.e: > > FileSource(args.path) > MercurialSource(repo, revision) > > vs: > > create_source(args.path) > create_source(repo, static=True, revision=revision) > > This not only preserves a stable API (which probably isn't too important), but > that way you also don't need to introduce a boolean flag which you then map to > hard-code types, but you can just delegate to cls (first arg of __new__). Here I prefer my implementation for the following reasons: 1. Overriding __new__ is a somewhat advanced technique that should not be used without a good reason. I don't think we have a good reason in this case. 2. Constructors of `FileSource` and `MercurialSource` would end up sometimes returning an object of a different class from the type the client is trying to construct (e.g. `MultiSource` instead of `MercurialSource`). This is not the end of the world, but confusing and better avoided. 3. The design of how `generate_static_files.py` and `test_server.py` are constructing the source could be improved and I'm using the opportunity to start doing it. Ideally, the code that instantiates and configures the sources would be the same in dynamic and static cases: this ensures consistency and makes it easier to change and test it. There was not much code in the dynamic case, only `FileSource` instantiation, but with the change, we now make it clear that what gets created is not necessarily a `FileSource`, it's just some subclass of `Source`. Expectations are more clear = good! In the static case there was caching code, that I'm not quite comfortable with because it messes too much with the class internals -- now it's moved to `create_source` function, closer to the source classes so the dependency is more obvious. It could be further improved by introducing a `CachingSource`: a proxy source that completely encapsulates the caching behavior, but let's hold our horses for now and just be content that it's become a trivial change inside `sources.py`. 4. `static` flag is a meaningful part of the API. It tells the source constructing machinery that we're going to be doing a one-off static generation and not serving pages on demand, where the changes on the disk should be observed. Right now it has the effect of enabling caching in `create_source` and choosing `MercurialSource` instead of `FileSource` (which I'm not too happy about, but for now we keep the behavior) but it might become something else later. The meaning of `static` parameter is that the client is not interested in tracking changes on the disk after the source is created and how that's ensured is the internal business of `create_source`.
https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:350: return sorted(files) On 2017/09/27 11:34:39, Vasily Kuznetsov wrote: > What might be important is the type of the return value. Sorted returns a list > while your proposal returns a set. I wrapped the set in a list for now, to be on > the safe side. Let me know if you think this is not necessary. The type returned here doesn't seem to be important either as FileSource gives you a list, while MercurialSource gives you a generator. So I would just return the set here.
I removed the conversion of set to list in `MultiSource.list_files` following Sebastian's suggestion and improved the docstring of `create_source` to make the meaning of `static` parameter more clear. https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc... cms/sources.py:350: return sorted(files) On 2017/09/27 15:55:22, Sebastian Noack wrote: > On 2017/09/27 11:34:39, Vasily Kuznetsov wrote: > > What might be important is the type of the return value. Sorted returns a list > > while your proposal returns a set. I wrapped the set in a list for now, to be > on > > the safe side. Let me know if you think this is not necessary. > > The type returned here doesn't seem to be important either as FileSource gives > you a list, while MercurialSource gives you a generator. So I would just return > the set here. Fair enough. Done.
https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:396: create_source(os.path.join(path, p)) This seems to be equivalent to FileSource(os.path.join(path, p)), hence the recursion would be unnecessary and just makes the code harder to follow. Also I wonder whether this would even work in practice. So on the server where MercurialSource is used since the repository isn't checked out these paths wouldn't exist anyway, or do I miss something?
https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:360: def create_source(path, static=False, revision='default'): The default revision should be 'master', not 'default'. https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:382: if static: Should this flag really determine which type of source we choose? I'd expect this to be dependent on the `revision` parameter - the default of this parameter should be None, indicating a FileSource. And if some value is passed in, we should use MercurialSource. https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:391: except ConfigParser.NoSectionError: What about NoOptionError? https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:396: create_source(os.path.join(path, p)) On 2017/09/28 20:48:21, Sebastian Noack wrote: > This seems to be equivalent to FileSource(os.path.join(path, p)), hence the > recursion would be unnecessary and just makes the code harder to follow. Not quite equivalent. The directory we link to might have additional paths of its own, so we might get a MultiSource instance here again. Whether such nested dependencies make a lot of sense is questionable, but we should support them for correctness. > Also I wonder whether this would even work in practice. So on the server where > MercurialSource is used since the repository isn't checked out these paths > wouldn't exist anyway, or do I miss something? It would work in practice, but in a rather suboptimal way. On the server we would use MercurialSource for the base source but FileSource for its dependencies. So the server would have to update the dependencies to the right revision, exactly the situation that MercurialSource is meant to avoid. Something else is a bigger concern IMHO. The source repository is no longer self-contained with this change, so whenever you want to test you would have to check out its dependencies (what revisions?) to the same location as what the server uses. Testing will require a rather messy setup now. We've been there with our source code, and it isn't really manageable. That's why we created ensure_dependencies.py. While reusing ensure_dependencies.py here might be complicated, we might want to use a simplified version of the logic. Meaning not additional_paths but rather inherit_from setting indicating repository location and revision. These repositories should be automatically cloned (with --noupdate flag) as subdirectories within a dependencies/ directory or pulled as necessary, we can use MercurialSource for them then - both for server and test environment.
Hi Wladimir, Thank you for the comments. I have addressed all except for the last one. That one will need some discussion to decide how to proceed. Julian and Winsley, Could you please comment on your preferences as to whether the CMS should rely on Mercurial to put the dependencies in place (in this case we would depend on repository url + revision instead of a path like now)? We had this discussion before, in the ticket, and I have referenced it here. See the bottom of this message for more detail. Cheers, Vasily https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:360: def create_source(path, static=False, revision='default'): On 2017/09/29 08:50:44, Wladimir Palant wrote: > The default revision should be 'master', not 'default'. Changed to `None`, following your next comment. Done https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:382: if static: On 2017/09/29 08:50:44, Wladimir Palant wrote: > Should this flag really determine which type of source we choose? I'd expect > this to be dependent on the `revision` parameter - the default of this parameter > should be None, indicating a FileSource. And if some value is passed in, we > should use MercurialSource. Yeah, you're right. `static` parameter should control the caching and `revision` parameter (or its absence) should select the source type. Done. https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:391: except ConfigParser.NoSectionError: On 2017/09/29 08:50:44, Wladimir Palant wrote: > What about NoOptionError? Thank you. Done. https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:396: create_source(os.path.join(path, p)) On 2017/09/29 08:50:44, Wladimir Palant wrote: > On 2017/09/28 20:48:21, Sebastian Noack wrote: > > This seems to be equivalent to FileSource(os.path.join(path, p)), hence the > > recursion would be unnecessary and just makes the code harder to follow. > > Not quite equivalent. The directory we link to might have additional paths of > its own, so we might get a MultiSource instance here again. Whether such nested > dependencies make a lot of sense is questionable, but we should support them for > correctness. Yes, nested dependencies are the reason why I made `create_source` recursive. > > Also I wonder whether this would even work in practice. So on the server where > > MercurialSource is used since the repository isn't checked out these paths > > wouldn't exist anyway, or do I miss something? > > It would work in practice, but in a rather suboptimal way. On the server we > would use MercurialSource for the base source but FileSource for its > dependencies. So the server would have to update the dependencies to the right > revision, exactly the situation that MercurialSource is meant to avoid. > > Something else is a bigger concern IMHO. The source repository is no longer > self-contained with this change, so whenever you want to test you would have to > check out its dependencies (what revisions?) to the same location as what the > server uses. Testing will require a rather messy setup now. We've been there > with our source code, and it isn't really manageable. That's why we created > ensure_dependencies.py. > > While reusing ensure_dependencies.py here might be complicated, we might want to > use a simplified version of the logic. Meaning not additional_paths but rather > inherit_from setting indicating repository location and revision. These > repositories should be automatically cloned (with --noupdate flag) as > subdirectories within a dependencies/ directory or pulled as necessary, we can > use MercurialSource for them then - both for server and test environment. This was one of the options in the original discussion in the ticket (see https://issues.adblockplus.org/ticket/5336#comment:2). After all the deliberations, we (Jon, Julian, Winsley, Matze, perhaps others from Website infra team) decided to go for a solution that gives more control and responsibility to the website developers and doesn't increase our dependency on Mercurial. I like your proposal (or my second option from the comment linked above) for that it makes CMS more aware of the dependencies and more able to ensure that they are put in place consistently and have right versions. What I don't like about it is that it makes CMS and the websites more dependent on Mercurial. In my experience the use of `MercurialSource` has only brought us awkwardness in testing (with potential missed issues) and problems in production (albeit, because people had wrong expectations about CMS behavior). My preference would be to have a simpler CMS that is just a site generator that knows nothing about version control and use standard tools to take care of dependency management. However, my preference is not as important as what website developers think and how they would prefer to work. So far, my impression was that they share my preferences. Perhaps Julian and Winsley, who are also CC'd on this review can comment on this. If it matters, from where we are now, implementing Wladimir's proposal in addition to, or instead of the current approach, would probably take me about a day of work.
> Could you please comment on your preferences as to whether the CMS should rely > on Mercurial to put the dependencies in place (in this case we would depend on > repository url + revision instead of a path like now)? We had this discussion > before, in the ticket, and I have referenced it here. See the bottom of this > message for more detail. I think that we should rely on npm + package.json to manage dependencies. We will use npm and gulp to build websites anyway (gulp will call cms) in the near (I hope) future.
https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... cms/sources.py:396: create_source(os.path.join(path, p)) On 2017/09/29 10:49:09, Vasily Kuznetsov wrote: > On 2017/09/29 08:50:44, Wladimir Palant wrote: > > On 2017/09/28 20:48:21, Sebastian Noack wrote: > > > This seems to be equivalent to FileSource(os.path.join(path, p)), hence the > > > recursion would be unnecessary and just makes the code harder to follow. > > > > Not quite equivalent. The directory we link to might have additional paths of > > its own, so we might get a MultiSource instance here again. Whether such > nested > > dependencies make a lot of sense is questionable, but we should support them > for > > correctness. > > Yes, nested dependencies are the reason why I made `create_source` recursive. > > > > Also I wonder whether this would even work in practice. So on the server > where > > > MercurialSource is used since the repository isn't checked out these paths > > > wouldn't exist anyway, or do I miss something? > > > > It would work in practice, but in a rather suboptimal way. On the server we > > would use MercurialSource for the base source but FileSource for its > > dependencies. So the server would have to update the dependencies to the right > > revision, exactly the situation that MercurialSource is meant to avoid. > > > > Something else is a bigger concern IMHO. The source repository is no longer > > self-contained with this change, so whenever you want to test you would have > to > > check out its dependencies (what revisions?) to the same location as what the > > server uses. Testing will require a rather messy setup now. We've been there > > with our source code, and it isn't really manageable. That's why we created > > ensure_dependencies.py. > > > > While reusing ensure_dependencies.py here might be complicated, we might want > to > > use a simplified version of the logic. Meaning not additional_paths but rather > > inherit_from setting indicating repository location and revision. These > > repositories should be automatically cloned (with --noupdate flag) as > > subdirectories within a dependencies/ directory or pulled as necessary, we can > > use MercurialSource for them then - both for server and test environment. > > This was one of the options in the original discussion in the ticket (see > https://issues.adblockplus.org/ticket/5336#comment:2). After all the > deliberations, we (Jon, Julian, Winsley, Matze, perhaps others from Website > infra team) decided to go for a solution that gives more control and > responsibility to the website developers and doesn't increase our dependency on > Mercurial. > > I like your proposal (or my second option from the comment linked above) for > that it makes CMS more aware of the dependencies and more able to ensure that > they are put in place consistently and have right versions. What I don't like > about it is that it makes CMS and the websites more dependent on Mercurial. In > my experience the use of `MercurialSource` has only brought us awkwardness in > testing (with potential missed issues) and problems in production (albeit, > because people had wrong expectations about CMS behavior). > > My preference would be to have a simpler CMS that is just a site generator that > knows nothing about version control and use standard tools to take care of > dependency management. However, my preference is not as important as what > website developers think and how they would prefer to work. So far, my > impression was that they share my preferences. Perhaps Julian and Winsley, who > are also CC'd on this review can comment on this. > > If it matters, from where we are now, implementing Wladimir's proposal in > addition to, or instead of the current approach, would probably take me about a > day of work. I agree with Wladimir. His suggestion seems to be the simplest and most consistent way to make sure the right dependencies are used, and also gives more control to websites developers. If we go with your intial approach, treating all dependencies as local file path, this will defeat the whole purpose of mercurial integration in the first place, and means that developers would have to locally setup the dependencies manually, and on the server somebody would have to do the same along configuring another cronjob that keeps the dependency up to date. Hence website developers wouldn't be able to add/update dependencies on their own, and there is no mechanism that makes sure the right version of a dependency is used. As discussed before, adding support for Git would be trivial, if it is preferred over Mercurial. But making the CMS agnostic of version control is probably a bad idea, even more so if we want to deal with external resources now.
On 2017/09/29 15:55:08, Sebastian Noack wrote: > https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py > File cms/sources.py (right): > > https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc... > cms/sources.py:396: create_source(os.path.join(path, p)) > On 2017/09/29 10:49:09, Vasily Kuznetsov wrote: > > On 2017/09/29 08:50:44, Wladimir Palant wrote: > > > On 2017/09/28 20:48:21, Sebastian Noack wrote: > > > > This seems to be equivalent to FileSource(os.path.join(path, p)), hence > the > > > > recursion would be unnecessary and just makes the code harder to follow. > > > > > > Not quite equivalent. The directory we link to might have additional paths > of > > > its own, so we might get a MultiSource instance here again. Whether such > > nested > > > dependencies make a lot of sense is questionable, but we should support them > > for > > > correctness. > > > > Yes, nested dependencies are the reason why I made `create_source` recursive. > > > > > > Also I wonder whether this would even work in practice. So on the server > > where > > > > MercurialSource is used since the repository isn't checked out these paths > > > > wouldn't exist anyway, or do I miss something? > > > > > > It would work in practice, but in a rather suboptimal way. On the server we > > > would use MercurialSource for the base source but FileSource for its > > > dependencies. So the server would have to update the dependencies to the > right > > > revision, exactly the situation that MercurialSource is meant to avoid. > > > > > > Something else is a bigger concern IMHO. The source repository is no longer > > > self-contained with this change, so whenever you want to test you would have > > to > > > check out its dependencies (what revisions?) to the same location as what > the > > > server uses. Testing will require a rather messy setup now. We've been there > > > with our source code, and it isn't really manageable. That's why we created > > > ensure_dependencies.py. > > > > > > While reusing ensure_dependencies.py here might be complicated, we might > want > > to > > > use a simplified version of the logic. Meaning not additional_paths but > rather > > > inherit_from setting indicating repository location and revision. These > > > repositories should be automatically cloned (with --noupdate flag) as > > > subdirectories within a dependencies/ directory or pulled as necessary, we > can > > > use MercurialSource for them then - both for server and test environment. > > > > This was one of the options in the original discussion in the ticket (see > > https://issues.adblockplus.org/ticket/5336#comment:2). After all the > > deliberations, we (Jon, Julian, Winsley, Matze, perhaps others from Website > > infra team) decided to go for a solution that gives more control and > > responsibility to the website developers and doesn't increase our dependency > on > > Mercurial. > > > > I like your proposal (or my second option from the comment linked above) for > > that it makes CMS more aware of the dependencies and more able to ensure that > > they are put in place consistently and have right versions. What I don't like > > about it is that it makes CMS and the websites more dependent on Mercurial. In > > my experience the use of `MercurialSource` has only brought us awkwardness in > > testing (with potential missed issues) and problems in production (albeit, > > because people had wrong expectations about CMS behavior). > > > > My preference would be to have a simpler CMS that is just a site generator > that > > knows nothing about version control and use standard tools to take care of > > dependency management. However, my preference is not as important as what > > website developers think and how they would prefer to work. So far, my > > impression was that they share my preferences. Perhaps Julian and Winsley, who > > are also CC'd on this review can comment on this. > > > > If it matters, from where we are now, implementing Wladimir's proposal in > > addition to, or instead of the current approach, would probably take me about > a > > day of work. > > I agree with Wladimir. His suggestion seems to be the simplest and most > consistent way to make sure the right dependencies are used, and also gives more > control to websites developers. > > If we go with your intial approach, treating all dependencies as local file > path, this will defeat the whole purpose of mercurial integration in the first > place, and means that developers would have to locally setup the dependencies > manually, and on the server somebody would have to do the same along configuring > another cronjob that keeps the dependency up to date. Hence website developers > wouldn't be able to add/update dependencies on their own, and there is no > mechanism that makes sure the right version of a dependency is used. > > As discussed before, adding support for Git would be trivial, if it is preferred > over Mercurial. But making the CMS agnostic of version control is probably a bad > idea, even more so if we want to deal with external resources now. Hi Sebastian, I agree that Wladimir's proposal fits better with how the CMS currently works. If we rely on Mercurial to obtain the snapshot of the website source from a certain revision, it makes sense to use similar snapshots for the dependencies. Further, if all dependencies are Mercurial repositories and CMS is already managing some Mercurial interaction for us, it makes sense to give it the responsibility of cloning and checking out the dependencies as well. You also point out that depending on directories would make CMS alone insufficient to generate a website from a source repository and you're probably right about this one too. However, if some of our dependencies are not Mercurial repositories and if we would like to use other tools in the build process anyway, our preferences about CMS would change, as they did indeed for the Websites Infrastructure team. The website developers (see Julian's comment above for example, but we've been discussing it for a while now) would like to be able to use any kind of dependencies (e.g. NPM modules) and tools (e.g. SCSS compiler) in the process of website generation. It is possible to try to fit all this functionality into CMS and use it as a multifunctional dependency management and orchestration tool, but this seems like a lot of work and a lot of code to maintain. So the vision we (Website infra team) have is a slim and simple CMS that does only the things that we can't do with common tools. We would like a CMS that plays well with any kind of process and does not force a specific tool or workflow upon the users. The workflow would still be automated, but with the tools that are designed for running tasks, for example gulp. So we would use specialized small tools that play well with each other, sort of like Unix philosophy. This is a more flexible approach that puts website developers in charge of how websites are generated and reduces the amount of wheel reinvention in the CMS so that CMS developers can focus on functionality that doesn't exist in common tools. The functionality that's contained in this review is geared towards this new vision of how websites are to be developed and deployed. It doesn't quite fit with the current approach but this is not important since the websites that will use dependencies will be deployed differently, in a way that works better with directory-based dependencies. Now website developers need support for this in CMS in order to prototype the new workflow. One could say that we could start using dependencies without changing the deployment approach if we implement support for mercurial-based dependencies, but the point of this ticket is to enable the new workflow, not to spend time on some kind of hybrid transitional workflow that will become legacy as soon as it's put into place. In any case, investing additional effort into this compatibility that nobody needs while blocking Julian in the process seems rather counterproductive to me. I hope this clarifies why the ticket is written the way that it is. The implementation here just follows the ticket. Cheers, Vasily
Patch set 4 looks fine. As to managing dependencies, it's a bit questionable whether npm is the right tool when the CMS itself is written in Python. Also, npm tends to be rather lax when it come to getting the right version. But if our webdevs want to work with that, so be it.
I still consider this approach rather questionable. npm doesn't seem to be the right tool to manage website content (potentially even including Python code), not to mention that it makes updating such content unnecessarily complicated, as you have to deal with packaging and releasing any shared web content/assets. Regardless which third-party tool we'd use to manage external resources, working with websites seems to become more complicated, when you suddenly doesn't have to only run one command to test your changes anymore. Yes, you could then introduce a build system (like gulp), making the setup even more complex, but I don't even see how that will integrate, latest when using stuff like SCSS without making the CMS aware of it, since the development server has to handle any generated content on the fly.
https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py#newc... cms/sources.py:360: def create_source(path, static=False, revision=None): Does it still make sense to call that argument "static"? Whether the output will be statically generated or delivered on the fly is an implementation detail of the calling code. Since now this flag is only used to turn on caching, why not just calling it "cached"?
On 2017/09/29 19:11:20, Sebastian Noack wrote: > I still consider this approach rather questionable. npm doesn't seem to be the > right tool to manage website content (potentially even including Python code), > not to mention that it makes updating such content unnecessarily complicated, as > you have to deal with packaging and releasing any shared web content/assets. > Regardless which third-party tool we'd use to manage external resources, working > with websites seems to become more complicated, when you suddenly doesn't have > to only run one command to test your changes anymore. Yes, you could then > introduce a build system (like gulp), making the setup even more complex, but I > don't even see how that will integrate, latest when using stuff like SCSS > without making the CMS aware of it, since the development server has to handle > any generated content on the fly. Yes, it's not the most obvious way to work if you're coming from "cms manages everything" approach or from ensure_dependencies-based process. Julian could probably explain this better, but I'll attempt sketching out how I imagine it: - Website repositories that have dependencies or build steps (or that are themselves dependencies for other websites) will have `package.json` in them. It will contain metadata and dependencies as needed (remember that npm can refer to repository URLs and revisions). - To generate a website we'll run `npm install` and then `gulp generate` or something like that. Maybe `gulp` can run `npm install` to make it one step. - Npm will put dependencies (including their dependencies recursively) in the place where CMS expects them (i.e. the paths described in `settings.ini`). It will also take care of dependencies that are not websites. - Gulp will run any build steps (SCSS compilers, JS preprocessors, etc.) and then run `generate_static_pages.py`. - To preview we'll run `gulp preview` that will do all the prerequisite stuff and then start CMS in preview mode. It will also observe any changes on the filesystem and rerun the build steps that will update the files that CMS sees. This way we'll get live preview also for files that require build steps. This gives us: - One (maximum two, if I'm wrong) command website generation on developer's machine or on the sever that works in a consistent repeatable way. - Dependencies on other websites (with revisions or without revisions) as well as dependencies on anything else packaged with NPM. - Build steps that also work with live preview. All of this comes almost for free (in terms of our own code) because we use robust existing tools. Seems like not such a horrible deal to me. What do you think? Cheers, Vasily https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py#newc... cms/sources.py:360: def create_source(path, static=False, revision=None): On 2017/09/29 19:11:30, Sebastian Noack wrote: > Does it still make sense to call that argument "static"? Whether the output will > be statically generated or delivered on the fly is an implementation detail of > the calling code. Since now this flag is only used to turn on caching, why not > just calling it "cached"? Yeah, you're right, now it would be more clear if it's just a caching flag. Done.
On 2017/09/29 19:11:20, Sebastian Noack wrote: > I still consider this approach rather questionable. npm doesn't seem to be the > right tool to manage website content (potentially even including Python code), > not to mention that it makes updating such content unnecessarily complicated, as > you have to deal with packaging and releasing any shared web content/assets. > Regardless which third-party tool we'd use to manage external resources, working > with websites seems to become more complicated, when you suddenly doesn't have > to only run one command to test your changes anymore. Yes, you could then > introduce a build system (like gulp), making the setup even more complex, but I > don't even see how that will integrate, latest when using stuff like SCSS > without making the CMS aware of it, since the development server has to handle > any generated content on the fly. I agree, npm doesn't seem to make much sense. Perhaps it makes more sense to Julian since he is using it more frequently but I understood that npm is not a very straightforward packager. Perhaps I need to look into it more.
On 2017/10/02 18:52:01, Jon Sonesen wrote: > On 2017/09/29 19:11:20, Sebastian Noack wrote: > > I still consider this approach rather questionable. npm doesn't seem to be the > > right tool to manage website content (potentially even including Python code), > > not to mention that it makes updating such content unnecessarily complicated, > as > > you have to deal with packaging and releasing any shared web content/assets. > > Regardless which third-party tool we'd use to manage external resources, > working > > with websites seems to become more complicated, when you suddenly doesn't have > > to only run one command to test your changes anymore. Yes, you could then > > introduce a build system (like gulp), making the setup even more complex, but > I > > don't even see how that will integrate, latest when using stuff like SCSS > > without making the CMS aware of it, since the development server has to handle > > any generated content on the fly. > > I agree, npm doesn't seem to make much sense. Perhaps it makes more sense to > Julian since he is using it more frequently but I understood that npm is not a > very straightforward packager. Perhaps I need to look into it more. Hi Jon, Thanks for the comment. I'm not quite sure what to make of it, so since this is blocking the ticket, let's maybe do something like this: 1. Based on all the comments above, do you think that the text of #5336 should be changed? 2. If `answer[1] is True`, what are your main concerns? Do you have a proposal for reformulating #5336? 3. If `answer[1] is False`, does the implementation here have any problems that I should fix? 4. If `answer[1] is Mu` and perhaps you need more information, maybe we (you, Julian, I, maybe Winsley) should have a chat about it. Cheers, Vasily P.S. See https://en.wikipedia.org/wiki/Mu_(negative)#The_Mu-koan if [4] is confusing ;)
On 2017/10/02 11:11:56, Vasily Kuznetsov wrote: > On 2017/09/29 19:11:20, Sebastian Noack wrote: > > I still consider this approach rather questionable. npm doesn't seem to be the > > right tool to manage website content (potentially even including Python code), > > not to mention that it makes updating such content unnecessarily complicated, > as > > you have to deal with packaging and releasing any shared web content/assets. > > Regardless which third-party tool we'd use to manage external resources, > working > > with websites seems to become more complicated, when you suddenly doesn't have > > to only run one command to test your changes anymore. Yes, you could then > > introduce a build system (like gulp), making the setup even more complex, but > I > > don't even see how that will integrate, latest when using stuff like SCSS > > without making the CMS aware of it, since the development server has to handle > > any generated content on the fly. > > Yes, it's not the most obvious way to work if you're coming from "cms manages > everything" approach or from ensure_dependencies-based process. Julian could > probably explain this better, but I'll attempt sketching out how I imagine it: > > - Website repositories that have dependencies or build steps (or that are > themselves dependencies for other websites) will have `package.json` in them. It > will contain metadata and dependencies as needed (remember that npm can refer to > repository URLs and revisions). > - To generate a website we'll run `npm install` and then `gulp generate` or > something like that. Maybe `gulp` can run `npm install` to make it one step. > - Npm will put dependencies (including their dependencies recursively) in > the place where CMS expects them (i.e. the paths described in `settings.ini`). > It will also take care of dependencies that are not websites. > - Gulp will run any build steps (SCSS compilers, JS preprocessors, etc.) and > then run `generate_static_pages.py`. > - To preview we'll run `gulp preview` that will do all the prerequisite stuff > and then start CMS in preview mode. It will also observe any changes on the > filesystem and rerun the build steps that will update the files that CMS sees. > This way we'll get live preview also for files that require build steps. > > This gives us: > - One (maximum two, if I'm wrong) command website generation on developer's > machine or on the sever that works in a consistent repeatable way. > - Dependencies on other websites (with revisions or without revisions) as well > as dependencies on anything else packaged with NPM. > - Build steps that also work with live preview. > > All of this comes almost for free (in terms of our own code) because we use > robust existing tools. Seems like not such a horrible deal to me. What do you > think? Yes, if this is how it will be implemented, that might work. But all in all this is a quite complex setup, you dream up there. I don't see us getting there any time soon, and whether it will be worth whatever complexity it takes still has to show. On 2017/10/02 18:52:01, Jon Sonesen wrote: > I agree, npm doesn't seem to make much sense. Perhaps it makes more sense to > Julian since he is using it more frequently but I understood that npm is not a > very straightforward packager. Perhaps I need to look into it more. npm is a decent package system for libraries and programs written in JavaScript, that is what it is for (just like pip is for Python). But it doesn't seem to be the right tool to bundle arbitrary assets (and even Python code). Furthermore, I raised some concerns above, about using an external tool for this task in the first place. But well, I won't keep you guys from shooting yourself in the foot, so LGTM. I just hope that I will still be able to edit content on adblockplus.org in the future, without too much pain. :/
> 1. Based on all the comments above, do you think that the text of #5336 should > be changed? No, I guess it is fine how it is since the comments point out the split of responsibilities or "scope" for the cms as well as the future need to uncomplicate that nuance of the cms. Although, I will add that this will increase the need for documentation that is available for all developers not only for the cms but likely for the websites so that (as sebastian pointed out) we will be able to make relatively painless changes to websites if we need to. > 3. If `answer[1] is False`, does the implementation here have any problems that > I should fix? No, I agree that the implementation is functional, so LGTM |