My first observation: since I’ve been developing Plone sites for a couple of years now, I’m not really part of the target audience. The book is meant for people without Plone experience; the reader doesn’t need any programming or CMS knowledge.
As a result the book starts with an introduction to Plone (and Python) and guides the reader through the installation of an instance. While the unified installer is demonstrated, the book also shows buildout. The latter may seem unnecessary for the target audience, but later chapters also use buildout, for instance to install add-on products.
Chapters three through eight are devoted to core concepts of Plone: managing content, configuration, users, groups, workflow, security, and so on. These chapters give the reader a good overview of what Plone has to offer for intranets. The author also suggests a number of third party products to add even more functionality (see the table of contents for details).
Chapter ten is where the book surprised me a bit. Suddenly the author starts developing a product himself. This wasn’t exactly what I had expected given the target audience. Chapter eleven switches back to using the available functionality (like content rules, WebDAV and external editing). Yet in chapter twelve there’s more code: subjects like TAL, METAL, viewlets, acquisition and resource registries are briefly explained.
The last chapter is devoted to deploying the intranet. It demonstrates Apache configurations, but also discusses the usage of a ZEO server/client setup, load balancing and caching.
It’s hard for me to put myself in the shoes of someone unfamiliar with programming, Plone, or even a CMS, so I might be underestimating the target audience of this book. However, I feel that the chapters about creating a custom product and theme (chapters ten and twelve) might not be suitable for them and could leave them with a lot of questions. On the other hand, the author has experience training non-technical end-users, so let’s give him the benefit of the doubt.
By the way, don’t let the fact that the book is based on Plone 3 scare you: the author frequently points out where it differs from Plone 4.
Overall, Plone 3 Intranets is a good introduction and a really nice overview of the broad functionality Plone has to offer. With its practical tips and examples, you should be able to get a nice intranet up and running after reading it.
First up, crawler access (in the Site configuration menu). This item offers three tools: you can generate a robots.txt file, test a robots.txt file, and request to remove a URL from the Google search results.
For those unfamiliar with a robots.txt file: it is a way to give instructions to a robot, like the ones used by search engines to index your website. With the tools in the crawler access section you can easily create such a file to e.g. restrict access to your internal search pages. Besides creating the robots.txt file, you can also test it by specifying a URL; the tool will tell you whether a robot is allowed to access that page. (Note: this is not a security mechanism. Robots can ignore the instructions.)
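The way such a check works can be sketched with Python’s standard library, which ships a robots.txt parser. The rules below are illustrative, not taken from the actual site:

```python
# Sketch of a robots.txt check using the standard library's
# urllib.robotparser; the rules and URLs here are made up.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The internal search pages are disallowed for all robots...
print(parser.can_fetch("Googlebot", "https://example.com/search?q=plone"))  # False
# ...while normal content stays accessible.
print(parser.can_fetch("Googlebot", "https://example.com/about"))  # True
```

Google’s testing tool does essentially this: given a URL and a user agent, it reports whether the rules in your robots.txt allow the fetch.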
In my case I only used the testing tool, since I already had a robots.txt file and just needed to adjust it.
The next thing I’d like to discuss is the crawl errors (in the Diagnostics menu). Among other things, it shows you which pages Google expected to be on your site but could not find. This was useful since there were a number of pages that were not available anymore after migrating from Plone to Django. (Especially pages that I forgot were accessible.) For instance:
- /author/markvl, which is now just /about
- /front-page (the default page of the Plone site root), which is now just /
Sure, I could have found those myself (and I probably should have), but it’s very easy to miss URLs that are referenced somewhere. This tool efficiently lists those missing pages, so I could add some rewrite rules in my web server configuration to solve the problems. (Note that the results can change, so you might want to check them regularly.)
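Rewrite rules for missing pages like these can be sketched with Apache’s mod_alias (assuming an Apache setup; the actual configuration of the site may differ):

```apache
# Permanently redirect the moved URLs to their new locations.
Redirect permanent /author/markvl /about
Redirect permanent /front-page /
```

A permanent (301) redirect also tells Google to update its index to the new URLs, so the crawl errors disappear over time.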
By the way, it’s also a nice way to catch mistakes in the content: in one case I had a typo in the link from one blog entry to another. It popped up in the crawl errors and I was able to correct it.
The crawl stats (also in the Diagnostics menu) display the Googlebot activity of the last 90 days. While there’s only so much you can do to control the way Google crawls your site, it was fun to see how some changes made an impact.
The first change, which is visible in all three graphs, is the migration to Django on May 31. Google clearly started crawling more pages since the end of May. As a logical result more bandwidth is consumed. However, less time is spent downloading a page.
The second change is most clear in the middle graph (kilobytes downloaded per day). I moved to a different server and enabled gzip compression on June 25. The move was planned but I might have forgotten about the compression if I hadn’t seen this graph.
This article focused on only a few of the available tools. There’s much more to discover, like an overview of the queries that returned pages from your site, or the ability to upload a sitemap (and to see how many of its pages have been indexed).
I really think that using the webmaster tools is a valuable addition to a webmaster’s toolkit.