Issue 29555839: Issue 5336 - Allow additional include, page, and template paths using CMS

Vasily Kuznetsov

Hi Jon, Sebastian and Julian! Here's the implementation of the "additional paths" feature. I created ...

Sept. 26, 2017, 1:26 p.m. (2017-09-26 13:26:47 UTC) #1

Sebastian Noack

https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newcode335 cms/sources.py:335: for base in self._bases: Nit: You could use any() ...

Sept. 27, 2017, 2:47 a.m. (2017-09-27 02:47:53 UTC) #2

Vasily Kuznetsov

Hi Sebastian, Thank you for the comments. I've implemented the first two and replied to ...

Sept. 27, 2017, 11:34 a.m. (2017-09-27 11:34:39 UTC) #3

Hi Sebastian,

Thank you for the comments. I've implemented the first two and replied to the
third (see below).

Cheers,
Vasily

https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc...
cms/sources.py:335: for base in self._bases:
On 2017/09/27 02:47:53, Sebastian Noack wrote:
> Nit: You could use any() here as well:
> 
>   return any(base.has_file(filename) for base in self._bases)

Done.

https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc...
cms/sources.py:350: return sorted(files)
On 2017/09/27 02:47:53, Sebastian Noack wrote:
> Nit: This could be written as a set literal one-liner:
> 
>   return {f for base in self._bases for f in base.list_files(subdir)}
> 
> Any reason why you sort? It seems the other sources also yield results in more
> or less arbitrary order.

Done.

Thanks for this suggestion. Having thought about it again, there's no
particularly good reason to sort. Sorting makes files in the same subdirectory
appear together, even if they are coming from different base sources, but this
doesn't seem to be used anywhere, so it's not really important.

What might be important is the type of the return value. Sorted returns a list
while your proposal returns a set. I wrapped the set in a list for now, to be on
the safe side. Let me know if you think this is not necessary.

https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newc...
cms/sources.py:366: def create_source(path, static=False, revision='default'):
On 2017/09/27 02:47:53, Sebastian Noack wrote:
> I would rather implement this in Source.__new__, so that the calling code
would
> basically remain the same, i.e:
> 
>   FileSource(args.path)
>   MercurialSource(repo, revision)
> 
> vs:
> 
>   create_source(args.path)
>   create_source(repo, static=True, revision=revision)
> 
> This not only preserves a stable API (which probably isn't too important), but
> that way you also don't need to introduce a boolean flag which you then map to
> hard-code types, but you can just delegate to cls (first arg of __new__).

Here I prefer my implementation for the following reasons:

1. Overriding __new__ is a somewhat advanced technique that should not be used
without a good reason. I don't think we have a good reason in this case.

2. Constructors of `FileSource` and `MercurialSource` would end up sometimes
returning an object of a different class from the type the client is trying to
construct (e.g. `MultiSource` instead of `MercurialSource`). This is not the end
of the world, but confusing and better avoided.

3. The design of how `generate_static_files.py` and `test_server.py` are
constructing the source could be improved and I'm using the opportunity to start
doing it. Ideally, the code that instantiates and configures the sources would
be the same in dynamic and static cases: this ensures consistency and makes it
easier to change and test it. There was not much code in the dynamic case, only
`FileSource` instantiation, but with the change, we now make it clear that what
gets created is not necessarily a `FileSource`, it's just some subclass of
`Source`. Expectations are more clear = good! In the static case there was
caching code, that I'm not quite comfortable with because it messes too much
with the class internals -- now it's moved to `create_source` function, closer
to the source classes so the dependency is more obvious. It could be further
improved by introducing a `CachingSource`: a proxy source that completely
encapsulates the caching behavior, but let's hold our horses for now and just be
content that it's become a trivial change inside `sources.py`.

4. `static` flag is a meaningful part of the API. It tells the source
constructing machinery that we're going to be doing a one-off static generation
and not serving pages on demand, where the changes on the disk should be
observed. Right now it has the effect of enabling caching in `create_source` and
choosing `MercurialSource` instead of `FileSource` (which I'm not too happy
about, but for now we keep the behavior) but it might become something else
later. The meaning of `static` parameter is that the client is not interested in
tracking changes on the disk after the source is created and how that's ensured
is the internal business of `create_source`.

Sebastian Noack

https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29555840/cms/sources.py#newcode350 cms/sources.py:350: return sorted(files) On 2017/09/27 11:34:39, Vasily Kuznetsov wrote: > ...

Sept. 27, 2017, 3:55 p.m. (2017-09-27 15:55:22 UTC) #4

Vasily Kuznetsov

I removed the conversion of set to list in `MultiSource.list_files` following Sebastian's suggestion and improved ...

Sept. 28, 2017, 9:08 a.m. (2017-09-28 09:08:17 UTC) #5

Sebastian Noack

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newcode396 cms/sources.py:396: create_source(os.path.join(path, p)) This seems to be equivalent to FileSource(os.path.join(path, ...

Sept. 28, 2017, 8:48 p.m. (2017-09-28 20:48:21 UTC) #6

Wladimir Palant

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newcode360 cms/sources.py:360: def create_source(path, static=False, revision='default'): The default revision should be ...

Sept. 29, 2017, 8:50 a.m. (2017-09-29 08:50:44 UTC) #7

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:360: def create_source(path, static=False, revision='default'):
The default revision should be 'master', not 'default'.

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:382: if static:
Should this flag really determine which type of source we choose? I'd expect
this to be dependent on the `revision` parameter - the default of this parameter
should be None, indicating a FileSource. And if some value is passed in, we
should use MercurialSource.

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:391: except ConfigParser.NoSectionError:
What about NoOptionError?

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:396: create_source(os.path.join(path, p))
On 2017/09/28 20:48:21, Sebastian Noack wrote:
> This seems to be equivalent to FileSource(os.path.join(path, p)), hence the
> recursion would be unnecessary and just makes the code harder to follow.

Not quite equivalent. The directory we link to might have additional paths of
its own, so we might get a MultiSource instance here again. Whether such nested
dependencies make a lot of sense is questionable, but we should support them for
correctness.

> Also I wonder whether this would even work in practice. So on the server where
> MercurialSource is used since the repository isn't checked out these paths
> wouldn't exist anyway, or do I miss something?

It would work in practice, but in a rather suboptimal way. On the server we
would use MercurialSource for the base source but FileSource for its
dependencies. So the server would have to update the dependencies to the right
revision, exactly the situation that MercurialSource is meant to avoid.

Something else is a bigger concern IMHO. The source repository is no longer
self-contained with this change, so whenever you want to test you would have to
check out its dependencies (what revisions?) to the same location as what the
server uses. Testing will require a rather messy setup now. We've been there
with our source code, and it isn't really manageable. That's why we created
ensure_dependencies.py.

While reusing ensure_dependencies.py here might be complicated, we might want to
use a simplified version of the logic. Meaning not additional_paths but rather
inherit_from setting indicating repository location and revision. These
repositories should be automatically cloned (with --noupdate flag) as
subdirectories within a dependencies/ directory or pulled as necessary, we can
use MercurialSource for them then - both for server and test environment.

Vasily Kuznetsov

Hi Wladimir, Thank you for the comments. I have addressed all except for the last ...

Sept. 29, 2017, 10:49 a.m. (2017-09-29 10:49:10 UTC) #8

Hi Wladimir,

Thank you for the comments. I have addressed all except for the last one. That
one will need some discussion to decide how to proceed.

Julian and Winsley,

Could you please comment on your preferences as to whether the CMS should rely
on Mercurial to put the dependencies in place (in this case we would depend on
repository url + revision instead of a path like now)? We had this discussion
before, in the ticket, and I have referenced it here. See the bottom of this
message for more detail.

Cheers,
Vasily

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:360: def create_source(path, static=False, revision='default'):
On 2017/09/29 08:50:44, Wladimir Palant wrote:
> The default revision should be 'master', not 'default'.

Changed to `None`, following your next comment. Done

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:382: if static:
On 2017/09/29 08:50:44, Wladimir Palant wrote:
> Should this flag really determine which type of source we choose? I'd expect
> this to be dependent on the `revision` parameter - the default of this
parameter
> should be None, indicating a FileSource. And if some value is passed in, we
> should use MercurialSource.

Yeah, you're right. `static` parameter should control the caching and `revision`
parameter (or its absence) should select the source type.

Done.

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:391: except ConfigParser.NoSectionError:
On 2017/09/29 08:50:44, Wladimir Palant wrote:
> What about NoOptionError?

Thank you. Done.

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:396: create_source(os.path.join(path, p))
On 2017/09/29 08:50:44, Wladimir Palant wrote:
> On 2017/09/28 20:48:21, Sebastian Noack wrote:
> > This seems to be equivalent to FileSource(os.path.join(path, p)), hence the
> > recursion would be unnecessary and just makes the code harder to follow.
> 
> Not quite equivalent. The directory we link to might have additional paths of
> its own, so we might get a MultiSource instance here again. Whether such
nested
> dependencies make a lot of sense is questionable, but we should support them
for
> correctness.

Yes, nested dependencies are the reason why I made `create_source` recursive.

> > Also I wonder whether this would even work in practice. So on the server
where
> > MercurialSource is used since the repository isn't checked out these paths
> > wouldn't exist anyway, or do I miss something?
> 
> It would work in practice, but in a rather suboptimal way. On the server we
> would use MercurialSource for the base source but FileSource for its
> dependencies. So the server would have to update the dependencies to the right
> revision, exactly the situation that MercurialSource is meant to avoid.
> 
> Something else is a bigger concern IMHO. The source repository is no longer
> self-contained with this change, so whenever you want to test you would have
to
> check out its dependencies (what revisions?) to the same location as what the
> server uses. Testing will require a rather messy setup now. We've been there
> with our source code, and it isn't really manageable. That's why we created
> ensure_dependencies.py.
> 
> While reusing ensure_dependencies.py here might be complicated, we might want
to
> use a simplified version of the logic. Meaning not additional_paths but rather
> inherit_from setting indicating repository location and revision. These
> repositories should be automatically cloned (with --noupdate flag) as
> subdirectories within a dependencies/ directory or pulled as necessary, we can
> use MercurialSource for them then - both for server and test environment.

This was one of the options in the original discussion in the ticket (see
https://issues.adblockplus.org/ticket/5336#comment:2). After all the
deliberations, we (Jon, Julian, Winsley, Matze, perhaps others from Website
infra team) decided to go for a solution that gives more control and
responsibility to the website developers and doesn't increase our dependency on
Mercurial.

I like your proposal (or my second option from the comment linked above) for
that it makes CMS more aware of the dependencies and more able to ensure that
they are put in place consistently and have right versions. What I don't like
about it is that it makes CMS and the websites more dependent on Mercurial. In
my experience the use of `MercurialSource` has only brought us awkwardness in
testing (with potential missed issues) and problems in production (albeit,
because people had wrong expectations about CMS behavior).

My preference would be to have a simpler CMS that is just a site generator that
knows nothing about version control and use standard tools to take care of
dependency management. However, my preference is not as important as what
website developers think and how they would prefer to work. So far, my
impression was that they share my preferences. Perhaps Julian and Winsley, who
are also CC'd on this review can comment on this.

If it matters, from where we are now, implementing Wladimir's proposal in
addition to, or instead of the current approach, would probably take me about a
day of work.

juliandoucette

> Could you please comment on your preferences as to whether the CMS should rely ...

Sept. 29, 2017, 1:34 p.m. (2017-09-29 13:34:05 UTC) #9

Sebastian Noack

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newcode396 cms/sources.py:396: create_source(os.path.join(path, p)) On 2017/09/29 10:49:09, Vasily Kuznetsov wrote: > ...

Sept. 29, 2017, 3:55 p.m. (2017-09-29 15:55:08 UTC) #10

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
cms/sources.py:396: create_source(os.path.join(path, p))
On 2017/09/29 10:49:09, Vasily Kuznetsov wrote:
> On 2017/09/29 08:50:44, Wladimir Palant wrote:
> > On 2017/09/28 20:48:21, Sebastian Noack wrote:
> > > This seems to be equivalent to FileSource(os.path.join(path, p)), hence
the
> > > recursion would be unnecessary and just makes the code harder to follow.
> > 
> > Not quite equivalent. The directory we link to might have additional paths
of
> > its own, so we might get a MultiSource instance here again. Whether such
> nested
> > dependencies make a lot of sense is questionable, but we should support them
> for
> > correctness.
> 
> Yes, nested dependencies are the reason why I made `create_source` recursive.
> 
> > > Also I wonder whether this would even work in practice. So on the server
> where
> > > MercurialSource is used since the repository isn't checked out these paths
> > > wouldn't exist anyway, or do I miss something?
> > 
> > It would work in practice, but in a rather suboptimal way. On the server we
> > would use MercurialSource for the base source but FileSource for its
> > dependencies. So the server would have to update the dependencies to the
right
> > revision, exactly the situation that MercurialSource is meant to avoid.
> > 
> > Something else is a bigger concern IMHO. The source repository is no longer
> > self-contained with this change, so whenever you want to test you would have
> to
> > check out its dependencies (what revisions?) to the same location as what
the
> > server uses. Testing will require a rather messy setup now. We've been there
> > with our source code, and it isn't really manageable. That's why we created
> > ensure_dependencies.py.
> > 
> > While reusing ensure_dependencies.py here might be complicated, we might
want
> to
> > use a simplified version of the logic. Meaning not additional_paths but
rather
> > inherit_from setting indicating repository location and revision. These
> > repositories should be automatically cloned (with --noupdate flag) as
> > subdirectories within a dependencies/ directory or pulled as necessary, we
can
> > use MercurialSource for them then - both for server and test environment.
> 
> This was one of the options in the original discussion in the ticket (see
> https://issues.adblockplus.org/ticket/5336#comment:2). After all the
> deliberations, we (Jon, Julian, Winsley, Matze, perhaps others from Website
> infra team) decided to go for a solution that gives more control and
> responsibility to the website developers and doesn't increase our dependency
on
> Mercurial.
> 
> I like your proposal (or my second option from the comment linked above) for
> that it makes CMS more aware of the dependencies and more able to ensure that
> they are put in place consistently and have right versions. What I don't like
> about it is that it makes CMS and the websites more dependent on Mercurial. In
> my experience the use of `MercurialSource` has only brought us awkwardness in
> testing (with potential missed issues) and problems in production (albeit,
> because people had wrong expectations about CMS behavior).
> 
> My preference would be to have a simpler CMS that is just a site generator
that
> knows nothing about version control and use standard tools to take care of
> dependency management. However, my preference is not as important as what
> website developers think and how they would prefer to work. So far, my
> impression was that they share my preferences. Perhaps Julian and Winsley, who
> are also CC'd on this review can comment on this.
> 
> If it matters, from where we are now, implementing Wladimir's proposal in
> addition to, or instead of the current approach, would probably take me about
a
> day of work.

I agree with Wladimir. His suggestion seems to be the simplest and most
consistent way to make sure the right dependencies are used, and also gives more
control to websites developers.

If we go with your intial approach, treating all dependencies as local file
path, this will defeat the whole purpose of mercurial integration in the first
place, and means that developers would have to locally setup the dependencies
manually, and on the server somebody would have to do the same along configuring
another cronjob that keeps the dependency up to date. Hence website developers
wouldn't be able to add/update dependencies on their own, and there is no
mechanism that makes sure the right version of a dependency is used.

As discussed before, adding support for Git would be trivial, if it is preferred
over Mercurial. But making the CMS agnostic of version control is probably a bad
idea, even more so if we want to deal with external resources now.

Vasily Kuznetsov

On 2017/09/29 15:55:08, Sebastian Noack wrote: > https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py > File cms/sources.py (right): > > https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newcode396 ...

Sept. 29, 2017, 5:28 p.m. (2017-09-29 17:28:53 UTC) #11

On 2017/09/29 15:55:08, Sebastian Noack wrote:
> https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py
> File cms/sources.py (right):
> 
>
https://codereview.adblockplus.org/29555839/diff/29558570/cms/sources.py#newc...
> cms/sources.py:396: create_source(os.path.join(path, p))
> On 2017/09/29 10:49:09, Vasily Kuznetsov wrote:
> > On 2017/09/29 08:50:44, Wladimir Palant wrote:
> > > On 2017/09/28 20:48:21, Sebastian Noack wrote:
> > > > This seems to be equivalent to FileSource(os.path.join(path, p)), hence
> the
> > > > recursion would be unnecessary and just makes the code harder to follow.
> > > 
> > > Not quite equivalent. The directory we link to might have additional paths
> of
> > > its own, so we might get a MultiSource instance here again. Whether such
> > nested
> > > dependencies make a lot of sense is questionable, but we should support
them
> > for
> > > correctness.
> > 
> > Yes, nested dependencies are the reason why I made `create_source`
recursive.
> > 
> > > > Also I wonder whether this would even work in practice. So on the server
> > where
> > > > MercurialSource is used since the repository isn't checked out these
paths
> > > > wouldn't exist anyway, or do I miss something?
> > > 
> > > It would work in practice, but in a rather suboptimal way. On the server
we
> > > would use MercurialSource for the base source but FileSource for its
> > > dependencies. So the server would have to update the dependencies to the
> right
> > > revision, exactly the situation that MercurialSource is meant to avoid.
> > > 
> > > Something else is a bigger concern IMHO. The source repository is no
longer
> > > self-contained with this change, so whenever you want to test you would
have
> > to
> > > check out its dependencies (what revisions?) to the same location as what
> the
> > > server uses. Testing will require a rather messy setup now. We've been
there
> > > with our source code, and it isn't really manageable. That's why we
created
> > > ensure_dependencies.py.
> > > 
> > > While reusing ensure_dependencies.py here might be complicated, we might
> want
> > to
> > > use a simplified version of the logic. Meaning not additional_paths but
> rather
> > > inherit_from setting indicating repository location and revision. These
> > > repositories should be automatically cloned (with --noupdate flag) as
> > > subdirectories within a dependencies/ directory or pulled as necessary, we
> can
> > > use MercurialSource for them then - both for server and test environment.
> > 
> > This was one of the options in the original discussion in the ticket (see
> > https://issues.adblockplus.org/ticket/5336#comment:2). After all the
> > deliberations, we (Jon, Julian, Winsley, Matze, perhaps others from Website
> > infra team) decided to go for a solution that gives more control and
> > responsibility to the website developers and doesn't increase our dependency
> on
> > Mercurial.
> > 
> > I like your proposal (or my second option from the comment linked above) for
> > that it makes CMS more aware of the dependencies and more able to ensure
that
> > they are put in place consistently and have right versions. What I don't
like
> > about it is that it makes CMS and the websites more dependent on Mercurial.
In
> > my experience the use of `MercurialSource` has only brought us awkwardness
in
> > testing (with potential missed issues) and problems in production (albeit,
> > because people had wrong expectations about CMS behavior).
> > 
> > My preference would be to have a simpler CMS that is just a site generator
> that
> > knows nothing about version control and use standard tools to take care of
> > dependency management. However, my preference is not as important as what
> > website developers think and how they would prefer to work. So far, my
> > impression was that they share my preferences. Perhaps Julian and Winsley,
who
> > are also CC'd on this review can comment on this.
> > 
> > If it matters, from where we are now, implementing Wladimir's proposal in
> > addition to, or instead of the current approach, would probably take me
about
> a
> > day of work.
> 
> I agree with Wladimir. His suggestion seems to be the simplest and most
> consistent way to make sure the right dependencies are used, and also gives
more
> control to websites developers.
> 
> If we go with your intial approach, treating all dependencies as local file
> path, this will defeat the whole purpose of mercurial integration in the first
> place, and means that developers would have to locally setup the dependencies
> manually, and on the server somebody would have to do the same along
configuring
> another cronjob that keeps the dependency up to date. Hence website developers
> wouldn't be able to add/update dependencies on their own, and there is no
> mechanism that makes sure the right version of a dependency is used.
> 
> As discussed before, adding support for Git would be trivial, if it is
preferred
> over Mercurial. But making the CMS agnostic of version control is probably a
bad
> idea, even more so if we want to deal with external resources now.

Hi Sebastian,

I agree that Wladimir's proposal fits better with how the CMS currently works.
If we rely on Mercurial to obtain the snapshot of the website source from a
certain revision, it makes sense to use similar snapshots for the dependencies.
Further, if all dependencies are Mercurial repositories and CMS is already
managing some Mercurial interaction for us, it makes sense to give it the
responsibility of cloning and checking out the dependencies as well. You also
point out that depending on directories would make CMS alone insufficient to
generate a website from a source repository and you're probably right about this
one too.

However, if some of our dependencies are not Mercurial repositories and if we
would like to use other tools in the build process anyway, our preferences about
CMS would change, as they did indeed for the Websites Infrastructure team. The
website developers (see Julian's comment above for example, but we've been
discussing it for a while now) would like to be able to use any kind of
dependencies (e.g. NPM modules) and tools (e.g. SCSS compiler) in the process of
website generation. It is possible to try to fit all this functionality into CMS
and use it as a multifunctional dependency management and orchestration tool,
but this seems like a lot of work and a lot of code to maintain. So the vision
we (Website infra team) have is a slim and simple CMS that does only the things
that we can't do with common tools. We would like a CMS that plays well with any
kind of process and does not force a specific tool or workflow upon the users.
The workflow would still be automated, but with the tools that are designed for
running tasks, for example gulp. So we would use specialized small tools that
play well with each other, sort of like Unix philosophy. This is a more flexible
approach that puts website developers in charge of how websites are generated
and reduces the amount of wheel reinvention in the CMS so that CMS developers
can focus on functionality that doesn't exist in common tools.

The functionality that's contained in this review is geared towards this new
vision of how websites are to be developed and deployed. It doesn't quite fit
with the current approach but this is not important since the websites that will
use dependencies will be deployed differently, in a way that works better with
directory-based dependencies. Now website developers need support for this in
CMS in order to prototype the new workflow. One could say that we could start
using dependencies without changing the deployment approach if we implement
support for mercurial-based dependencies, but the point of this ticket is to
enable the new workflow, not to spend time on some kind of hybrid transitional
workflow that will become legacy as soon as it's put into place. In any case,
investing additional effort into this compatibility that nobody needs while
blocking Julian in the process seems rather counterproductive to me.

I hope this clarifies why the ticket is written the way that it is. The
implementation here just follows the ticket.

Cheers,
Vasily

Wladimir Palant

Patch set 4 looks fine. As to managing dependencies, it's a bit questionable whether npm ...

Sept. 29, 2017, 6:08 p.m. (2017-09-29 18:08:28 UTC) #12

Sebastian Noack

I still consider this approach rather questionable. npm doesn't seem to be the right tool ...

Sept. 29, 2017, 7:11 p.m. (2017-09-29 19:11:20 UTC) #13

Sebastian Noack

https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py File cms/sources.py (right): https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py#newcode360 cms/sources.py:360: def create_source(path, static=False, revision=None): Does it still make sense ...

Sept. 29, 2017, 7:11 p.m. (2017-09-29 19:11:30 UTC) #14

Vasily Kuznetsov

On 2017/09/29 19:11:20, Sebastian Noack wrote: > I still consider this approach rather questionable. npm ...

Oct. 2, 2017, 11:11 a.m. (2017-10-02 11:11:56 UTC) #15

On 2017/09/29 19:11:20, Sebastian Noack wrote:
> I still consider this approach rather questionable. npm doesn't seem to be the
> right tool to manage website content (potentially even including Python code),
> not to mention that it makes updating such content unnecessarily complicated,
as
> you have to deal with packaging and releasing any shared web content/assets.
> Regardless which third-party tool we'd use to manage external resources,
working
> with websites seems to become more complicated, when you suddenly doesn't have
> to only run one command to test your changes anymore. Yes, you could then
> introduce a build system (like gulp), making the setup even more complex, but
I
> don't even see how that will integrate, latest when using stuff like SCSS
> without making the CMS aware of it, since the development server has to handle
> any generated content on the fly.

Yes, it's not the most obvious way to work if you're coming from "cms manages
everything" approach or from ensure_dependencies-based process. Julian could
probably explain this better, but I'll attempt sketching out how I imagine it:

- Website repositories that have dependencies or build steps (or that are
themselves dependencies for other websites) will have `package.json` in them. It
will contain metadata and dependencies as needed (remember that npm can refer to
repository URLs and revisions).
- To generate a website we'll run `npm install` and then `gulp generate` or
something like that. Maybe `gulp` can run `npm install` to make it one step.
    - Npm will put dependencies (including their dependencies recursively) in
the place where CMS expects them (i.e. the paths described in `settings.ini`).
It will also take care of dependencies that are not websites.
    - Gulp will run any build steps (SCSS compilers, JS preprocessors, etc.) and
then run `generate_static_pages.py`.
- To preview we'll run `gulp preview` that will do all the prerequisite stuff
and then start CMS in preview mode. It will also observe any changes on the
filesystem and rerun the build steps that will update the files that CMS sees.
This way we'll get live preview also for files that require build steps.

This gives us:
- One (maximum two, if I'm wrong) command website generation on developer's
machine or on the sever that works in a consistent repeatable way.
- Dependencies on other websites (with revisions or without revisions) as well
as dependencies on anything else packaged with NPM.
- Build steps that also work with live preview.

All of this comes almost for free (in terms of our own code) because we use
robust existing tools. Seems like not such a horrible deal to me. What do you
think?

Cheers,
Vasily

https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py
File cms/sources.py (right):

https://codereview.adblockplus.org/29555839/diff/29559578/cms/sources.py#newc...
cms/sources.py:360: def create_source(path, static=False, revision=None):
On 2017/09/29 19:11:30, Sebastian Noack wrote:
> Does it still make sense to call that argument "static"? Whether the output
will
> be statically generated or delivered on the fly is an implementation detail of
> the calling code. Since now this flag is only used to turn on caching, why not
> just calling it "cached"?

Yeah, you're right, now it would be more clear if it's just a caching flag.

Done.

Jon Sonesen

On 2017/09/29 19:11:20, Sebastian Noack wrote: > I still consider this approach rather questionable. npm ...

Oct. 2, 2017, 6:52 p.m. (2017-10-02 18:52:01 UTC) #16

Vasily Kuznetsov

On 2017/10/02 18:52:01, Jon Sonesen wrote: > On 2017/09/29 19:11:20, Sebastian Noack wrote: > > ...

Oct. 2, 2017, 7:50 p.m. (2017-10-02 19:50:37 UTC) #17

Sebastian Noack

On 2017/10/02 11:11:56, Vasily Kuznetsov wrote: > On 2017/09/29 19:11:20, Sebastian Noack wrote: > > ...

Oct. 2, 2017, 11:51 p.m. (2017-10-02 23:51:42 UTC) #18

On 2017/10/02 11:11:56, Vasily Kuznetsov wrote:
> On 2017/09/29 19:11:20, Sebastian Noack wrote:
> > I still consider this approach rather questionable. npm doesn't seem to be
the
> > right tool to manage website content (potentially even including Python
code),
> > not to mention that it makes updating such content unnecessarily
complicated,
> as
> > you have to deal with packaging and releasing any shared web content/assets.
> > Regardless which third-party tool we'd use to manage external resources,
> working
> > with websites seems to become more complicated, when you suddenly doesn't
have
> > to only run one command to test your changes anymore. Yes, you could then
> > introduce a build system (like gulp), making the setup even more complex,
but
> I
> > don't even see how that will integrate, latest when using stuff like SCSS
> > without making the CMS aware of it, since the development server has to
handle
> > any generated content on the fly.
> 
> Yes, it's not the most obvious way to work if you're coming from "cms manages
> everything" approach or from ensure_dependencies-based process. Julian could
> probably explain this better, but I'll attempt sketching out how I imagine it:
> 
> - Website repositories that have dependencies or build steps (or that are
> themselves dependencies for other websites) will have `package.json` in them.
It
> will contain metadata and dependencies as needed (remember that npm can refer
to
> repository URLs and revisions).
> - To generate a website we'll run `npm install` and then `gulp generate` or
> something like that. Maybe `gulp` can run `npm install` to make it one step.
>     - Npm will put dependencies (including their dependencies recursively) in
> the place where CMS expects them (i.e. the paths described in `settings.ini`).
> It will also take care of dependencies that are not websites.
>     - Gulp will run any build steps (SCSS compilers, JS preprocessors, etc.)
and
> then run `generate_static_pages.py`.
> - To preview we'll run `gulp preview` that will do all the prerequisite stuff
> and then start CMS in preview mode. It will also observe any changes on the
> filesystem and rerun the build steps that will update the files that CMS sees.
> This way we'll get live preview also for files that require build steps.
> 
> This gives us:
> - One (maximum two, if I'm wrong) command website generation on developer's
> machine or on the sever that works in a consistent repeatable way.
> - Dependencies on other websites (with revisions or without revisions) as well
> as dependencies on anything else packaged with NPM.
> - Build steps that also work with live preview.
> 
> All of this comes almost for free (in terms of our own code) because we use
> robust existing tools. Seems like not such a horrible deal to me. What do you
> think?

Yes, if this is how it will be implemented, that might work. But all in all this
is a quite complex setup, you dream up there. I don't see us getting there any
time soon, and whether it will be worth whatever complexity it takes still has
to show.

On 2017/10/02 18:52:01, Jon Sonesen wrote:
> I agree, npm doesn't seem to make much sense. Perhaps it makes more sense to
> Julian since he is using it more frequently but I understood that npm is not a
> very straightforward packager. Perhaps I need to look into it more.

npm is a decent package system for libraries and programs written in JavaScript,
that is what it is for (just like pip is for Python). But it doesn't seem to be
the right tool to bundle arbitrary assets (and even Python code). Furthermore, I
raised some concerns above, about using an external tool for this task in the
first place.

But well, I won't keep you guys from shooting yourself in the foot, so LGTM. I
just hope that I will still be able to edit content on adblockplus.org in the
future, without too much pain. :/

Jon Sonesen

Oct. 3, 2017, 7:51 p.m. (2017-10-03 19:51:12 UTC) #19

> 1. Based on all the comments above, do you think that the text of #5336 should
> be changed?

No, I guess it is fine how it is since the comments point out the split of
responsibilities or "scope" for the cms as well as the future need to
uncomplicate that nuance of the cms. Although, I will add that this will
increase the need for documentation that is available for all developers not
only for the cms but likely for the websites so that (as sebastian pointed out)
we will be able to make relatively painless changes to websites if we need to.

> 3. If `answer[1] is False`, does the implementation here have any problems
that
> I should fix?

No, I agree that the implementation is functional, so LGTM

Issue 29555839: Issue 5336 - Allow additional include, page, and template paths using CMS (Closed)

Description

Patch Set 1 #

Patch Set 2 : Address comments on PS1 #

Patch Set 3 : Address comments on PS2, improve docstring of create_source #

Patch Set 4 : Address comments on PS3 #

Patch Set 5 : Rename 'static' parameter to 'cached' #

Messages