Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Xamarin.Android.Build.Tasks] Allow BuildApk to run incrementally. #3999

Merged
merged 2 commits into from
Jan 30, 2020

Conversation

dellis1972
Copy link
Contributor

The BuildApk task currently does not work incrementally.
When it is called it throws away the previously built apk
in favour of using the base on that aapt/aapt2 just built.
With incremental builds it is highly likely that only a
few files have actually changed so we are wasting a ton of time.

During my tests on a simple app BuildApk generally takes
about 4 seconds to run. This is the same for both clean
and incremental builds.
With these changes, this time could now go down to 600 ms
for an incremental build. Which is a decent improvement.

The new system does not throw away all the work done in
the previous build. Instead it uses the previous output
apk as a base on which to produce the new one.
The output from aapt/aapt2 is instead merged into the
current apk. We do this by checking the CRC values of
each entry and only merging in files which have actually
changed. We need to use the CRC here because aapt/aapt2
is NEVER setting the ModifiedDate on the zip items :(.
It seems to always be 01/01/1980.

Once those changes are out of the way , we can now move on
to our custom content. This time we can use the ModifiedDate
of both the ZipEntry AND the source file. This allows us
to, again, only update the items which have accutally changed.

One final part is we need to keep track of what files are in the
apk in the first place and which ones get added to it. This is
what the existingEntries List is used for. This allows us to
detect when files are REMOVED from the system. Otherwise we end up
with stale files in the apk from previous builds.

On my tests when changing a single AndroidResource file I got
the following results

3664 ms BuildApk                  1 calls (Before)
669 ms BuildApk                  1 calls (After)

Which is a decent improvement.

@dellis1972
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@jonathanpeppers jonathanpeppers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since Dean is out for the holidays, I'll look into these.

src/Xamarin.Android.Build.Tasks/Tasks/BuildApk.cs Outdated Show resolved Hide resolved
src/Xamarin.Android.Build.Tasks/Tasks/BuildApk.cs Outdated Show resolved Hide resolved
@jonathanpeppers
Copy link
Member

@dellis1972 if you happen to stop by, I'm getting something weird on Windows:

image

In this unit test, the zip entry has a ModificationTime older than the file on disk?

Could the code producing these timestamps have a problem?

@dellis1972
Copy link
Contributor Author

@dellis1972 if you happen to stop by, I'm getting something weird on Windows:

image

In this unit test, the zip entry has a ModificationTime older than the file on disk?

Could the code producing these timestamps have a problem?

The timestamps in the zip are from the File afaik, and libzip handles all of that.

@jonathanpeppers
Copy link
Member

In this unit test, the zip entry has a ModificationTime older than the file on disk?

The issue was that the zip file doesn't include milliseconds, but on Windows you get millisecond precision from file timestamps.

@dellis1972 dellis1972 force-pushed the buildapkinc branch 3 times, most recently from 1f2f47d to c960fd5 Compare January 10, 2020 17:36
@dellis1972
Copy link
Contributor Author

Getting weird test results with the SkipExistingFile_Exist and SkipExistingFile_TimeStamp tests.

The A.txt file is written to disk before the zip is created...

BUT when we examine the date times later.. the file on disk has a NEWER time???? By almost a second. Nothing touches this file between it being created, added to the zip and then being read to get the TimeStamp.

14/01/2020 15:49:59 (disk 01/14/2020 15:49:59:688) <= 14/01/2020 15:49:58 (zip 01/14/2020 15:49:58:000) 

@dellis1972
Copy link
Contributor Author

We have a weird problem where the Modified Date of the entry in a zip is somehow OLDER than the source file it is created from. Its consistently 1 second older, but its not 100% of the time. It seems like some sort of rounding error 🤷‍♂

It turns out this is a limitation of the Zip file format. If we read the Structure document [1] we come across this sentence.

The FAT filesystem of DOS has a timestamp resolution of only two seconds; ZIP file records mimic this. As a result, the built-in timestamp resolution of files in a ZIP archive is only two seconds, though extra fields can be used to store more precise timestamps.
So a modified date in a zip file can be one of TWO seconds. We can confirm this issue by looking at the time encoding functions used within lib zip itself. The Encoding code makes used of bit shifting to encode the unix timestamp into the required Zip format. The Decoding code reverses this operation. So lets run through a few examples.

So given a time of 12:46 and 29 seconds if we run through the code we get a value of 26062 to store in the zip file local header. If we use 12:46 and 30 seconds we get 26063. So far so go. Now let's plug these encoded values into our decoding code (x << 1) & 62.

(26062 << 1) & 62 = 28
(26063 << 1) & 62 = 30

As you can see while the 30 second timestamp decoded correctly, the 29 second one did not. So when we convert these values back to DateTime values we can end up with a timestamp that is 1 second in the past as it always rounds down. This is not great when we want to try to compare timestamps with files created on disk. Say a file was created at 29 seconds and we create the zip at the same time (within a second), according to the zip it would have been created at 28 seconds. So if we use the timestamps to compare it will thing the file is NEWER. Because this only happens if the zip and the file are created within the same 1 second timespan, this is not always a problem.

The work around in this case is to detect when we get these collisions and bump UP to the next second when we are creating the zip file.

But to be honest, I'm not sure there will ever be a good workaround because of the format of the zip file.

[1] https://en.wikipedia.org/wiki/Zip_(file_format)#Structure

string root = Path.GetDirectoryName (GetType ().Assembly.Location);
temp = Path.Combine (root, "temp", TestContext.CurrentContext.Test.Name);
root = Path.GetDirectoryName (GetType ().Assembly.Location);
string temp = Path.Combine (root, "temp", TestContext.CurrentContext.Test.Name);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests were running in Parallel, and because some used the same root folder they tripped up over each other. So I had to rework this a bit so it would work in a parallel environment.

@@ -272,7 +272,7 @@ public virtual void UpdateProjectFiles (string directory, IEnumerable<ProjectRes
continue;
}
string filedir = directory;
if (path.Contains (Path.DirectorySeparatorChar)) {
if (path.Contains ($"{Path.DirectorySeparatorChar}")) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathanpeppers do we prefer this or .ToString ?

Copy link
Member

@jonathanpeppers jonathanpeppers Jan 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both look fine, so I looked at the IL produced:

// if (text.Contains($"{Path.DirectorySeparatorChar}"))
IL_00bb: ldloc.2
IL_00bc: ldstr "{0}"
IL_00c1: ldsfld char [netstandard]System.IO.Path::DirectorySeparatorChar
IL_00c6: box [netstandard]System.Char
IL_00cb: call string [netstandard]System.String::Format(string, object)
IL_00d0: callvirt instance bool [netstandard]System.String::Contains(string)

So I think it only matters if perf is a concern, ToString() would be better and avoid a string.Format("{0}", Path.DirectorySeparatorChar). It also appears to box!

This PR is green and it's a test, so I wouldn't change it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its good to know that ToString is more performant :D

@dellis1972
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dellis1972
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

The `BuildApk` task currently does not work incrementally.
When it is called it throws away the previously built apk
in favour of using the `base` on that `aapt/aapt2` just built.
With incremental builds it is highly likely that only a
few files have actually changed so we are wasting a ton of time.

During my tests on a simple app `BuildApk` generally takes
about 4 seconds to run. This is the same for both clean
and incremental builds.
With these changes, this time could now go down to 600 ms
for an incremental build. Which is a decent improvement.

The new system does not throw away all the work done in
the previous build. Instead it uses the previous output
apk as a base on which to produce the new one.
The output from `aapt/aapt2` is instead `merged` into the
current apk. We do this by checking the `CRC` values of
each entry and only merging in files which have actually
changed. We need to use the `CRC` here because `aapt/aapt2`
is NEVER setting the `ModifiedDate` on the zip items :(.
It seems to always be 01/01/1980.

Once those changes are out of the way , we can now move on
to our custom content. This time we can use the `ModifiedDate`
of both the `ZipEntry` AND the source file. This allows us
to, again, only update the items which have accutally changed.

One final part is we need to keep track of what files are in the
`apk` in the first place and which ones get added to it. This is
what the `existingEntries` List is used for. This allows us to
detect when files are REMOVED from the system. Otherwise we end up
with stale files in the apk from previous builds.

On my tests when changing a single AndroidResource file I got
the following results

	3664 ms BuildApk                  1 calls (Before)
	669 ms BuildApk                  1 calls (After)

Which is a decent improvement.
@dellis1972
Copy link
Contributor Author

/azp run

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@dellis1972 dellis1972 merged commit 1c4727c into dotnet:master Jan 30, 2020
@dellis1972 dellis1972 deleted the buildapkinc branch January 30, 2020 09:35
@github-actions github-actions bot locked and limited conversation to collaborators Jan 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants