Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test AMP compatibility of entire site #1183

Merged
merged 91 commits into from
Aug 29, 2018
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
Show all changes
91 commits
Select commit Hold shift + click to select a range
305c28b
Begin testing of site urls, starting with helper method.
kienstra May 30, 2018
9e9df92
Change helper method to output IDs, for use with existing method.
kienstra May 30, 2018
fd50993
Change method name in test, assign $post_ids.
kienstra May 30, 2018
1324044
Query for the latest posts, instead of by ID.
kienstra May 30, 2018
3c4db61
Change helper method to get post permalinks.
kienstra May 30, 2018
bbfafa4
Add a method to get links for public terms.
kienstra May 30, 2018
db54a53
Also get links for post_format terms.
kienstra May 30, 2018
6713ff3
Begin to apply review suggestion for get_taxonomy_links().
kienstra May 30, 2018
2655310
Add an $offset parameter to get_taxonomy_links().
kienstra May 30, 2018
32ab635
Add (optional) to the @param tag for $offset.
kienstra May 30, 2018
1a862c7
Pass 'fields' => 'ids' to WP_Query().
kienstra May 31, 2018
c250bd5
Allow paging through posts in get_post_permalinks().
kienstra May 31, 2018
71f7543
Add a method to crawl the entire site.
kienstra May 31, 2018
4aade51
Address Travis error by using full class name, not self::
kienstra May 31, 2018
584ffea
Force theme support of 'amp' on validation requests.
kienstra May 31, 2018
ae4371c
Add an 'amp' query arg if it's in Paired Mode.
kienstra May 31, 2018
7b67efe
Begin WP-CLI script to crawl the site.
kienstra Jun 1, 2018
15ec0ed
Display the number of validation issues in the sucess message:
kienstra Jun 1, 2018
1ef10a7
Site crawling script: output 'more detail' link.
kienstra Jun 1, 2018
ba08f3d
Update PR for latest changes to validator, merge develop.
kienstra Jul 12, 2018
bfe37a8
Register a WP-CLI command: wp amp validate-site
kienstra Jul 12, 2018
3d9a01c
Call the WP-CLI tick() method for every URL validated.
kienstra Jul 12, 2018
c586c24
Remove 'This might take a while...'
kienstra Jul 12, 2018
128ca9c
Only report unaccepted validation errors.
kienstra Jul 12, 2018
441044f
Remove the return value of validate_entire_site_urls().
kienstra Jul 12, 2018
875d4ee
Address PHPCS error by aligning =
kienstra Jul 12, 2018
4e8c00c
Remove function that forced AMP theme support.
kienstra Jul 12, 2018
c4b8f5f
Go back to displaying total errors,
kienstra Jul 12, 2018
c28e5ee
Remove wrapping in add_query_arg()
kienstra Jul 12, 2018
b6393cc
Fix the progress bar, by making counts more accurate.
kienstra Jul 12, 2018
c638e7e
Align = to address failed Travis build.
kienstra Jul 12, 2018
ad43100
Remove empty lines to address Travis issue.
kienstra Jul 12, 2018
5af8dc9
Output a count of unaccepted errors.
kienstra Jul 12, 2018
adcb901
Stop storing the validated URLs in a property.
kienstra Jul 13, 2018
9d11439
Make count_posts_and_terms() private.
kienstra Jul 13, 2018
5e19020
Improve PHP DocBlocks.
kienstra Aug 2, 2018
f4e6bfd
Address failed unit test.
kienstra Aug 2, 2018
8e1b031
Make function static, and other documentation changes
kienstra Aug 3, 2018
a4c1f7a
Merge branch 'develop' into valid-site
kienstra Aug 3, 2018
ce2b044
Add to DocBlock, remove needless empty line.
kienstra Aug 3, 2018
cd1d09e
Remove 'posts_per_page' => -1 from WP_Query arguments
kienstra Aug 11, 2018
edfdd65
Address failed unit test
kienstra Aug 12, 2018
19f4feb
Address Travis error by aligning =.
kienstra Aug 12, 2018
e34ebda
Merge branch 'develop' into valid-site
kienstra Aug 12, 2018
57411c6
Remove extra conditional block to call tick()
kienstra Aug 14, 2018
6802175
Improve documentation, remove duplicated code.
kienstra Aug 14, 2018
a53bb45
Add a way to force crawling non-AMP-enabled URLs
kienstra Aug 15, 2018
d44cde9
Exclude taxonomy templates if the user has unchecked them
kienstra Aug 15, 2018
71f4fcb
Add a flag to force validation of the entire site (but not yet implem…
kienstra Aug 15, 2018
ff3a5c0
Allow crawling templates the user has unchecked in 'Supported Templates'
kienstra Aug 15, 2018
d378a37
If there are no AMP-enabled taxonomies, don't count them.
kienstra Aug 15, 2018
4f8b3e9
Add an argument --include to the WP-CLI command.
kienstra Aug 15, 2018
4ea25bf
Implement the --include argument for taxonomies and posts
kienstra Aug 16, 2018
dadab10
Validate author pages, including with include=is_author
kienstra Aug 16, 2018
c687ea0
Validate the search template
kienstra Aug 16, 2018
09fa083
If there are no URLs to crawl, call WP_CLI:error()
kienstra Aug 16, 2018
eadb61c
Merge branch 'develop' into valid-site
kienstra Aug 16, 2018
e8717fd
Update dev-lib: Install WP-CLI after installing WordPress on Travis CI
westonruter Aug 16, 2018
13a0cc7
Ensure WP-CLI is available for deploy script
westonruter Aug 16, 2018
f054f13
Update Node so that Object.values is available
westonruter Aug 16, 2018
be32dcf
Refactor get_author_page_urls() to use round-robin validation
kienstra Aug 16, 2018
ff8d365
Refactor validate_entire_site_url() to use round-robin validation
kienstra Aug 16, 2018
1387c5b
Apply the maximum URL property to count_urls_to_validate()
kienstra Aug 17, 2018
41cffdc
Add a --max-url-count argument
kienstra Aug 17, 2018
0202432
Account for the homepage in the --include argument
kienstra Aug 17, 2018
bff8fc7
Refactor validate_urls() to validate a single URL
kienstra Aug 17, 2018
5d0d464
Display the validity by template type, like category: 15/16
kienstra Aug 17, 2018
86cd224
Get the date template, and validate it
kienstra Aug 21, 2018
c31fa0e
Remove default value in get_taxonomy_links()
kienstra Aug 21, 2018
be8d552
Remove exclusion of attachments from tests, other documentations
kienstra Aug 21, 2018
3dac2fd
Test removing the array_merge() call (will probably revert)
kienstra Aug 21, 2018
fe8eab3
Revert "Test removing the array_merge() call (will probably revert)"
kienstra Aug 21, 2018
b741f01
Merge branch 'develop' into valid-site
kienstra Aug 21, 2018
4cf9e9d
Account for the date page in the initial count
kienstra Aug 21, 2018
907facd
If there are no matched templates from --include, output error.
kienstra Aug 21, 2018
52ff854
Merge branch 'develop' of https://github.com/Automattic/amp-wp into v…
westonruter Aug 23, 2018
ab9c2d9
Tidy static analysis complaints
westonruter Aug 24, 2018
1f8f40b
Rename AMP_Site_Validation to AMP_CLI; init even in Classic mode
westonruter Aug 24, 2018
aca16f4
Refactor AMP_CLI to use methods as subcommands
westonruter Aug 24, 2018
2637fb8
Ignore auto-sanitization when looking for unaccepted errors
westonruter Aug 24, 2018
da22efc
In Classic Mode, call WP_CLI::error() if the --force flag isn't present
kienstra Aug 24, 2018
bfadbe2
Allow validating the site in 'Classic' mode
kienstra Aug 24, 2018
30f7522
Fix phpdoc for WP-CLI
westonruter Aug 28, 2018
e1760aa
Prevent crawling search or date URLs if empty; use for loop
westonruter Aug 28, 2018
ad7e0e2
Show warning when validate_url call fails
westonruter Aug 28, 2018
7a52751
Fix classic mode site validation by forcing native mode
westonruter Aug 28, 2018
8e2d028
Re-use amp_validate query param for forcing AMP theme support in clas…
westonruter Aug 28, 2018
edcb23a
Unconditionally initialize validation manager w/ registered post type…
westonruter Aug 29, 2018
0e9bf1b
Switch to query posts in descending order for improved recency releva…
westonruter Aug 29, 2018
a83a569
Simplify should_show_in_menu return condition
westonruter Aug 29, 2018
19ead16
Fix grammar typo in CLI success message
westonruter Aug 29, 2018
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions includes/class-amp-autoloader.php
Original file line number Diff line number Diff line change
Expand Up @@ -92,6 +92,7 @@ class AMP_Autoloader {
'AMP_Validation_Manager' => 'includes/validation/class-amp-validation-manager',
'AMP_Invalid_URL_Post_Type' => 'includes/validation/class-amp-invalid-url-post-type',
'AMP_Validation_Error_Taxonomy' => 'includes/validation/class-amp-validation-error-taxonomy',
'AMP_Site_Validation' => 'includes/validation/class-amp-site-validation',
'AMP_String_Utils' => 'includes/utils/class-amp-string-utils',
'AMP_WP_Utils' => 'includes/utils/class-amp-wp-utils',
'AMP_Widget_Archives' => 'includes/widgets/class-amp-widget-archives',
Expand Down
76 changes: 76 additions & 0 deletions includes/validation/class-amp-site-validation.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
<?php
/**
* Class AMP_Site_Validation
*
* @package AMP
*/

/**
* Class AMP_Site_Validation
*
* @since 1.0
*/
class AMP_Site_Validation {

/**
* Gets the permalinks of all public post types with the status 'publish,' to check for AMP validity.
*
* This excludes 'attachment' post types, as it only searches for posts with the status 'publish.'
* Attachments have a default status of 'inherit,' so they can depend on the status of their parent like a post.
*
* @todo: consider whether this should also return 'attachment' IDs.
* @param int $number_posts The maximum amount of posts to get the permalinks for (optional).
* @return string[] $permalinks The post permalinks in an array.
*/
public static function get_post_permalinks( $number_posts = 200 ) {
$post_types = get_post_types( array( 'public' => true ), 'names' );
$query = new WP_Query( array(
'posts_per_page' => $number_posts,
'post_type' => array_values( $post_types ),
'post_status' => 'publish',
) );

$post_ids = wp_list_pluck( $query->posts, 'ID' );
return array_map( 'get_permalink', $post_ids );
}

/**
* Gets the front-end links for public taxonomy terms, like categories and tags.
*
* For example, https://example.org/?cat=2
* This includes categories and tags, and any more that are registered.
* But it excludes post_format terms.
*
* @param int $number_links The maximum amount of links to get (optional).
* @return string[] $links The term links in an array.
*/
public static function get_taxonomy_links( $number_links = 200 ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be pagination support here. The WP_Term_Query supports an $offset arg. I suggest that this method only return links for one specific taxonomy. The site crawler will probably need to make multiple calls to such a method to iterate over all of the taxonomy terms in batches. One way to do it would be to get the terms sorted by ID in ascending order. That ensures that when you paginate through the terms you won't miss terms that get added while you're processing through them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, good ideas. These commits apply your suggestions, as I understand them. This now has $taxonomy and $offset parameters.

$public_taxonomies = get_taxonomies( array(
'public' => true,
) );

// It doesn't seem necessary to get links for post format terms, like asides, galleries, or quotes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I think it is necessary. There are archive links for post formats: https://codex.wordpress.org/Function_Reference/get_post_format_link

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, commit db54a removes the exemption of post_format terms.

unset( $public_taxonomies['post_format'] );
$terms = get_terms( array(
'taxonomy' => $public_taxonomies,
'count' => $number_links,
) );

return array_map( 'get_term_link', $terms );
}

/**
* Validates the URLs on the site.
*
* @todo: Consider wrapping this function with another, as different use cases will probably require a different return value or display.
* For example, the <button> in /wp-admin that makes an AJAX request for this will need a different response than a WP-CLI command.
* @return array $validation_result The post ID as the index, and the result of validation as the value.
*/
public static function validate_site_urls() {
$site_post_ids = self::get_post_ids();
foreach ( $site_post_ids as $id ) {
AMP_Validation_Manager::$posts_pending_frontend_validation[ $id ] = true;
}
return AMP_Validation_Manager::validate_queued_posts_on_frontend();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you suggested, this validate_queued_posts_on_frontend method is probably not relevant. The main thing will be to:

  1. Obtain the list of URLs on the site, including homepage, posts, pages, categories, archives, etc. A good sitemap XML generator plugin should have the logic you need for this.
  2. If in paired mode, append ?amp to each of the URLs discovered. This will be used instead of amp_get_permalink() because the endpoint will be eliminated when AMP theme support is present. See Discontinue use of amp endpoint in favor of query var when amp theme support is present #1148. Some consideration in paired mode may be needed for whether AMP is available or not for a given post.
  3. For each AMP URL, call \AMP_Validation_Manager::validate_url() and pass the results into \AMP_Invalid_URL_Post_Type::store_validation_errors().
  4. Chunk the results to prevent timing out. This is particularly key in WP-Cron and Ajax. In WP-CLI this isn't relevant, and a progress meter should be shown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your fast reply, @westonruter! These details really help.

}
}
134 changes: 134 additions & 0 deletions tests/test-class-amp-site-validation.php
Original file line number Diff line number Diff line change
@@ -0,0 +1,134 @@
<?php
/**
* Tests for AMP_Site_Validation class.
*
* @package AMP
*/

/**
* Tests for AMP_Site_Validation class.
*
* @since 1.0
*/
class Test_AMP_Site_Validation extends \WP_UnitTestCase {

/**
* The name of the tested class.
*
* @var string
*/
const TESTED_CLASS = 'AMP_Site_Validation';

/**
* The name of the tag to test.
*
* @var string
*/
const TAG_NAME = 'img';

/**
* A disallowed element, which should cause a validation error.
*
* @var string
*/
const DISALLOWED_TAG = '<script>doThis();</script>';

/**
* An instance of DOMElement to test.
*
* @var DOMElement
*/
public $node;

/**
* Setup.
*
* @inheritdoc
* @global $wp_registered_widgets
*/
public function setUp() {
parent::setUp();
$dom_document = new DOMDocument( '1.0', 'utf-8' );
$this->node = $dom_document->createElement( self::TAG_NAME );
AMP_Validation_Manager::reset_validation_results();
}

/**
* Test get_post_permalinks.
*
* @covers AMP_Site_Validation::get_post_permalinks()
*/
public function test_get_post_permalinks() {
$number_posts_each_post_type = 20;
$post_types = get_post_types( array( 'public' => true ), 'names' );
$expected_post_permalinks = array();

/**
* The tested method does not get attachment permalinks.
* It only searches for posts with the status 'publish.'
* Attachments have a default status of 'inherit,' to depend on the status of their parent post.
*/
unset( $post_types['attachment'] );
foreach ( $post_types as $post_type ) {
for ( $i = 0; $i < $number_posts_each_post_type; $i++ ) {
$expected_post_permalinks[] = get_permalink( $this->factory()->post->create( array(
'post_type' => $post_type,
) ) );
}
}
$number_of_posts = count( $post_types ) * $number_posts_each_post_type;
$actual_post_permalinks = AMP_Site_Validation::get_post_permalinks( $number_of_posts );

/*
* The factory() method above creates posts so quickly that the WP_Query() default argument of 'orderby' => 'date'
* doesn't return them in the exact order they were created.
* So this simply ensures all of the created $post_ids are present in the return value of the tested method.
*/
$this->assertEquals( 0, count( array_diff( $expected_post_permalinks, $actual_post_permalinks ) ) );
$this->assertEquals( count( $expected_post_permalinks ), count( $actual_post_permalinks ) );
}

/**
* Test get_taxonomy_links.
*
* @covers AMP_Site_Validation::get_taxonomy_links()
*/
public function test_get_taxonomy_links() {
$number_links_each_taxonomy = 20;
$taxonomies = get_taxonomies( array(
'public' => true,
) );

unset( $taxonomies['post_format'] );
$all_terms = array();

foreach ( $taxonomies as $taxonomy ) {
$terms_for_current_taxonomy = array();
for ( $i = 0; $i < $number_links_each_taxonomy; $i++ ) {
$terms_for_current_taxonomy[] = $this->factory()->term->create( array(
'taxonomy' => $taxonomy,
) );
}

// Terms need to be associated with a post in order to be returned in get_terms().
wp_set_post_terms(
$this->factory()->post->create(),
$terms_for_current_taxonomy,
$taxonomy
);
$all_terms = array_merge( $all_terms, $terms_for_current_taxonomy );
}

$expected_links = array_map( 'get_term_link', $all_terms );
$number_of_links = 100;
$actual_links = AMP_Site_Validation::get_taxonomy_links( $number_of_links );

/*
* Test that all of the $expected_links are present.
* There is already one term present before this test method adds any,
* so that can also appear in the returned $actual_links.
*/
$this->assertEquals( 0, count( array_diff( $expected_links, $actual_links ) ) );
$this->assertLessThan( $number_of_links, count( $actual_links ) );
}
}