[wp-trac] [WordPress Trac] #12301: built up sitemap
WordPress Trac
noreply at wordpress.org
Sat Sep 1 00:03:04 UTC 2018
#12301: built up sitemap
-----------------------------+-----------------------
Reporter: omponk | Owner: (none)
Type: feature request | Status: reopened
Priority: normal | Milestone:
Component: Administration | Version: 3.0
Severity: normal | Resolution:
Keywords: | Focuses:
-----------------------------+-----------------------
Changes (by igrigorik):
* status: closed => reopened
* resolution: wontfix =>
Comment:
Hi folks.
I'd like to make the case that this ticket should be reopened and
WordPress should provide a sitemap out of the box.
== Why do sitemaps matter?
Sitemaps are a [https://www.sitemaps.org/protocol.html common protocol]
that enables crawlers to efficiently discover and index a site's content,
identify pages that they might otherwise miss, and quickly detect updates
to previously indexed content. Instead of forcing crawlers to recrawl
every page to find other pages or to detect whether an update has been
made, sitemaps enumerate the pages and their update timestamps. This helps
speed up indexing and content discovery, which greatly benefits both the
site owner and the crawler.
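To make the "pages and their update timestamps" point concrete, here is a minimal sketch of what generating such a file could look like. It is illustrative Python rather than WordPress code; the `build_sitemap` helper and the example.com URLs are hypothetical, while the `urlset`, `url`, `loc` and `lastmod` element names come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for an iterable of (url, last_modified) pairs.

    Hypothetical helper for illustration only; not a WordPress API.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod takes a W3C Datetime value; a plain date is sufficient here.
        ET.SubElement(url, "lastmod").text = last_modified.strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="unicode")


# Made-up example data: one URL and timestamp per page, no page content.
entries = [
    ("https://example.com/", datetime(2018, 8, 30, tzinfo=timezone.utc)),
    ("https://example.com/about/", datetime(2018, 8, 1, tzinfo=timezone.utc)),
]
sitemap_xml = build_sitemap(entries)
print(sitemap_xml)
```

A crawler that reads this file only needs to refetch the pages whose `lastmod` value is newer than its last visit.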
All popular search engines support and look for sitemaps when visiting a
site (e.g. see [https://support.google.com/webmasters/answer/156184?hl=en
Google docs]), and having WordPress sites provide one by default would
help the content get indexed and discovered faster.
== What's the story today in WordPress?
(A) The site owner can install a plugin to generate a sitemap. This has
multiple problems...
- The site owner needs to know that such a thing is needed. Most
publishers won't, nor should they ever have to think about it. WordPress
should come with all the batteries included to maximize the site owner's
success on the web: having content discovered and indexed fast is
critical.
- Because sitemaps are generated by 3P plugins today, there is no interop
story between multiple plugins that may want to do this. For example, if
SEO plugin A creates a sitemap, there is no well defined hook for SEO
plugin B to detect this and avoid generating another variant that can
cause both crawler and publisher confusion. We did an audit of popular
plugins and none of them offer any form of detection / compatibility with
other plugins. This can yield unpredictable results for the site owner.
(B) Some crawlers support RSS/Atom feeds in lieu of a sitemap. However,
this also has many problems...
- RSS feeds typically only enumerate a small (latest-N) number of pages,
whereas the point of the sitemap is to provide the full list of content
that should be crawled and indexed.
- RSS feeds generated by WP do not list all of the site’s content. For
example, as raised earlier, pages are not listed in the RSS feed. In
theory this could be, once again, fixed by a
[https://wordpress.org/plugins/rss-includes-pages/ 3P plugin], but now
we're back to all the same problems as (A).
- If we did try to bend current RSS feeds to list all content, the
resulting feed could be huge for some sites.
- Sitemaps do not need the actual page content
- Sitemaps can be split into multiple files
In short, neither of these solutions is good enough.
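As an illustration of the point above that sitemaps can be split into multiple files, the sketch below chunks a URL list and emits a sitemap index that points at each chunk. It assumes the 50,000-URLs-per-file limit from the sitemaps.org protocol and does not check the separate uncompressed file-size limit; the helper names, file names and URLs are hypothetical and not part of WordPress.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit in the sitemaps.org protocol


def split_urls(urls, chunk_size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks of at most chunk_size entries."""
    return [urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size)]


def build_urlset(urls):
    """Serialize one chunk as a <urlset> document (URL only, no content)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc in urls:
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = loc
    return ET.tostring(urlset, encoding="unicode")


def build_index(sitemap_locations):
    """Serialize a <sitemapindex> document listing each sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc in sitemap_locations:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = loc
    return ET.tostring(index, encoding="unicode")


# Made-up site with 120,000 posts: too many for a single sitemap file.
urls = [f"https://example.com/?p={n}" for n in range(1, 120_001)]
chunks = split_urls(urls)
files = {
    f"sitemap-{i}.xml": build_urlset(chunk)
    for i, chunk in enumerate(chunks, start=1)
}
index_xml = build_index([f"https://example.com/{name}" for name in files])
print(len(files), "sitemap files listed in the index")
```

An RSS feed has no equivalent of the index file, which is one reason it scales poorly as a full content listing.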
== Wishlist for sitemap support in WordPress
WordPress should provide a sitemap file out of the box, with no action
needed by the site owner, same as the RSS feed:
1. WordPress core is responsible for generating a standard XML sitemap.
1. Sitemaps should respect size limits
[https://searchengineland.com/google-bing-increase-file-size-limit-sitemaps-files-264338 enforced by
popular search engines] and split large sitemaps.
1. WordPress core should provide a standard API to enumerate available
sitemaps, such that plugins can use this API to retrieve and submit these
sitemaps on the publisher's behalf to relevant providers.
1. WordPress core should provide a notification hook for when a sitemap
file is updated, to assist with (3). When a page, post, or other type of
publicly accessible content is created or updated, an internal event
should be fired.
--
Ticket URL: <https://core.trac.wordpress.org/ticket/12301#comment:9>
WordPress Trac <https://core.trac.wordpress.org/>
WordPress publishing platform