Add encoding detection (and pin export to UTF-8) #8506

Merged
merged 35 commits into from
Mar 9, 2022
Changes from 23 commits
955ceea
Pin encoding to UTF-8
koppor Jan 13, 2022
84d3a4d
Add more test cases
koppor Jan 13, 2022
e32bdf3
Compile fix
koppor Jan 13, 2022
7fdd769
Compilefixes
koppor Jan 13, 2022
e6abb20
Merge branch 'main' into fix-changed
koppor Feb 18, 2022
507e736
Fix position of CHANGELOG.md entries
koppor Feb 18, 2022
a8d2f57
Fix CHANGELOG.md
koppor Feb 18, 2022
30b9b51
Fix file path
koppor Feb 18, 2022
1e754f7
Fix visilibity of loggers
koppor Feb 19, 2022
592c8a2
Log I/O exception at charset detection
koppor Feb 19, 2022
f54c91c
Add charset detection
koppor Feb 19, 2022
e6e2954
Add more LOGGER statements
koppor Feb 19, 2022
16ada78
Really remove "default encoding" preference
koppor Feb 19, 2022
49b3ded
Merge branch 'main' into fix-changed
koppor Feb 19, 2022
851ab12
Add more test cases
koppor Feb 19, 2022
eb1f15c
Remove hint on gradle wrapper update (as we have a GitHub action for …
koppor Feb 19, 2022
28b41c3
Fix checkstyle
koppor Feb 19, 2022
11a9749
fix missing import
Siedlerchr Feb 19, 2022
932cf60
refactor tests, compare optionals
Siedlerchr Feb 19, 2022
fef3b5e
Merge branch 'main' into fix-changed
koppor Feb 20, 2022
f082486
Use system's default charset as default
koppor Feb 20, 2022
b4656d1
Fix merge error
koppor Feb 20, 2022
7471b5f
Fix parameter
koppor Feb 21, 2022
8e753ec
Merge remote-tracking branch 'origin/main' into fix-changed
koppor Feb 24, 2022
5bd4ba4
Merge remote-tracking branch 'upstream/main' into fix-changed
Siedlerchr Feb 26, 2022
8872ca4
Fix l10
Siedlerchr Feb 26, 2022
bb7473c
Fix charset detection
Siedlerchr Feb 26, 2022
e3feb6a
Merge remote-tracking branch 'upstream/main' into fix-changed
Siedlerchr Feb 26, 2022
a13c98f
remove reverted commit from changelog
Siedlerchr Feb 26, 2022
40c20bf
Disable charset detection for pdfs
Siedlerchr Feb 26, 2022
ce39f25
return right charset for utf16
Siedlerchr Feb 26, 2022
a90e519
YAML supports UTF-8 by default
Siedlerchr Feb 26, 2022
3394dbd
remove useless test
Siedlerchr Feb 26, 2022
ef1fd5c
Merge remote-tracking branch 'upstream/main' into fix-changed
Siedlerchr Mar 9, 2022
75d2f7d
Fix checkstyle
Siedlerchr Mar 9, 2022
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -18,16 +18,21 @@ Note that this project **does not** adhere to [Semantic Versioning](http://semve
- The CSL preview styles now also support displaying data from cross references entries that are linked via the `crossref` field [#7378](https://github.com/JabRef/jabref/issues/7378)
- We made the Search button in Web Search wider. We also skewed the panel titles to the left [#8397](https://github.com/JabRef/jabref/issues/8397)
- We introduced a preference to disable fulltext indexing [#8468](https://github.com/JabRef/jabref/issues/8468)
+- When exporting entries, the encoding is always UTF-8
+- When embedding BibTeX data into a PDF, the encoding is always UTF-8

### Fixed

- We fixed an issue where an exception could occur when saving the preferences [#7614](https://github.com/JabRef/jabref/issues/7614)
- We fixed an issue where "Copy DOI url" in the right-click menu of the Entry List would just copy the DOI and not the DOI url. [#8389](https://github.com/JabRef/jabref/issues/8389)
- We fixed an issue where opening the console from the drop-down menu would cause an exception. [#8466](https://github.com/JabRef/jabref/issues/8466)
+- We fixed an issue when reading non-UTF-8 encoded files. When no encoding header is present, the encoding is now detected from the file content (and the preference option is disregarded) [#8417](https://github.com/JabRef/jabref/issues/8417)
- We fixed an issue where modifying a library would trigger reindexing of all PDFs [#8420](https://github.com/JabRef/jabref/issues/8420)

### Removed

- We removed the option to copy CSL Citation styles data as `XSL_FO`, `ASCIIDOC`, and `RTF` as these have not worked for a long time and are no longer supported in the external library used for processing the styles [#7378](https://github.com/JabRef/jabref/issues/7378)
+- We removed the option to configure the default encoding. The default encoding is now hard-coded to the modern UTF-8 encoding.
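
The `Fixed` entry in this hunk says that, when a file carries no encoding header, the encoding is detected from the file content. A minimal sketch of that idea, assuming only the JDK: attempt a strict UTF-8 decode and fall back to a caller-supplied charset if it fails. This is not the detection the PR implements (the diff adds the `icu4j-charset` dependency for that), and `CharsetDetectionSketch` and `detect` are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JabRef.
final class CharsetDetectionSketch {

    /**
     * Returns UTF-8 if the bytes decode strictly as UTF-8,
     * otherwise the fallback chosen by the caller.
     */
    static Charset detect(byte[] content, Charset fallback) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // REPORT makes invalid byte sequences throw instead of being replaced
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(content));
            return StandardCharsets.UTF_8;
        } catch (CharacterCodingException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // "\u00f6" is the letter o with diaeresis; escapes keep the source ASCII-only
        byte[] utf8 = "K\u00f6ln".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "K\u00f6ln".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(detect(utf8, StandardCharsets.ISO_8859_1));
        System.out.println(detect(latin1, StandardCharsets.ISO_8859_1));
    }
}
```

A strict decode can only say whether the content is valid UTF-8; it cannot tell one legacy single-byte encoding from another, which is where a statistical detector is needed.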



5 changes: 1 addition & 4 deletions build.gradle
@@ -1,12 +1,8 @@
import groovy.json.JsonSlurper
import org.gradle.internal.os.OperatingSystem
import org.jabref.build.JournalAbbreviationConverter
import org.jabref.build.xjc.XjcPlugin
import org.jabref.build.xjc.XjcTask

-// to update the gradle wrapper, execute
-// ./gradlew wrapper --gradle-version=6.0 --distribution-type=bin

plugins {
id 'application'

@@ -122,6 +118,7 @@ dependencies {
implementation 'com.h2database:h2-mvstore:2.1.210'

implementation group: 'org.apache.tika', name: 'tika-core', version: '2.3.0'
+implementation 'com.ibm.icu:icu4j-charset:70.1'

// required for reading write-protected PDFs - see https://github.com/JabRef/jabref/pull/942#issuecomment-209252635
implementation 'org.bouncycastle:bcprov-jdk15on:1.70'
1 change: 1 addition & 0 deletions src/main/java/module-info.java
@@ -91,6 +91,7 @@
requires org.antlr.antlr4.runtime;
requires org.fxmisc.flowless;
requires org.apache.tika.core;
+requires com.ibm.icu;

requires flexmark;
requires flexmark.ext.gfm.strikethrough;
41 changes: 17 additions & 24 deletions src/main/java/org/jabref/cli/ArgumentProcessor.java
@@ -1,7 +1,7 @@
package org.jabref.cli;

import java.io.IOException;
-import java.nio.charset.Charset;
+import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
@@ -236,7 +236,6 @@ private List<ParserResult> processArguments() {
if (!loaded.isEmpty()) {
writeMetadatatoPdf(loaded,
cli.getWriteMetadatatoPdf(),
-preferencesService.getGeneralPreferences().getDefaultEncoding(),
preferencesService.getXmpPreferences(),
preferencesService.getFilePreferences(),
preferencesService.getGeneralPreferences().getDefaultBibDatabaseMode(),
@@ -271,7 +270,7 @@ private List<ParserResult> processArguments() {
return loaded;
}

-private void writeMetadatatoPdf(List<ParserResult> loaded, String filesAndCitekeys, Charset encoding, XmpPreferences xmpPreferences, FilePreferences filePreferences, BibDatabaseMode databaseMode, BibEntryTypesManager entryTypesManager, FieldWriterPreferences fieldWriterPreferences, boolean writeXMP, boolean embeddBibfile) {
+private void writeMetadatatoPdf(List<ParserResult> loaded, String filesAndCitekeys, XmpPreferences xmpPreferences, FilePreferences filePreferences, BibDatabaseMode databaseMode, BibEntryTypesManager entryTypesManager, FieldWriterPreferences fieldWriterPreferences, boolean writeXMP, boolean embeddBibfile) {
if (loaded.isEmpty()) {
LOGGER.error("The write xmp option depends on a valid import option.");
return;
@@ -285,7 +284,7 @@ private void writeMetadatatoPdf(List<ParserResult> loaded, String filesAndCiteke

if ("all".equals(filesAndCitekeys)) {
for (BibEntry entry : dataBase.getEntries()) {
-writeMetadatatoPDFsOfEntry(databaseContext, entry.getCitationKey().orElse("<no cite key defined>"), entry, encoding, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);
+writeMetadatatoPDFsOfEntry(databaseContext, entry.getCitationKey().orElse("<no cite key defined>"), entry, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);
}
return;
}
@@ -300,22 +299,22 @@ private void writeMetadatatoPdf(List<ParserResult> loaded, String filesAndCiteke
}
}

-writeMetadatatoPdfByCitekey(databaseContext, dataBase, citeKeys, encoding, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);
-writeMetadatatoPdfByFileNames(databaseContext, dataBase, pdfs, encoding, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);
+writeMetadatatoPdfByCitekey(databaseContext, dataBase, citeKeys, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);
+writeMetadatatoPdfByFileNames(databaseContext, dataBase, pdfs, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);

}

-private void writeMetadatatoPDFsOfEntry(BibDatabaseContext databaseContext, String citeKey, BibEntry entry, Charset encoding, FilePreferences filePreferences, XmpPdfExporter xmpPdfExporter, EmbeddedBibFilePdfExporter embeddedBibFilePdfExporter, boolean writeXMP, boolean embeddBibfile) {
+private void writeMetadatatoPDFsOfEntry(BibDatabaseContext databaseContext, String citeKey, BibEntry entry, FilePreferences filePreferences, XmpPdfExporter xmpPdfExporter, EmbeddedBibFilePdfExporter embeddedBibFilePdfExporter, boolean writeXMP, boolean embeddBibfile) {
try {
if (writeXMP) {
-if (xmpPdfExporter.exportToAllFilesOfEntry(databaseContext, encoding, filePreferences, entry, List.of(entry))) {
+if (xmpPdfExporter.exportToAllFilesOfEntry(databaseContext, filePreferences, entry, List.of(entry))) {
System.out.printf("Successfully written XMP metadata on at least one linked file of %s%n", citeKey);
} else {
System.err.printf("Cannot write XMP metadata on any linked files of %s. Make sure there is at least one linked file and the path is correct.%n", citeKey);
}
}
if (embeddBibfile) {
-if (embeddedBibFilePdfExporter.exportToAllFilesOfEntry(databaseContext, encoding, filePreferences, entry, List.of(entry))) {
+if (embeddedBibFilePdfExporter.exportToAllFilesOfEntry(databaseContext, filePreferences, entry, List.of(entry))) {
System.out.printf("Successfully embedded metadata on at least one linked file of %s%n", citeKey);
} else {
System.out.printf("Cannot embedd metadata on any linked files of %s. Make sure there is at least one linked file and the path is correct.%n", citeKey);
@@ -326,20 +325,20 @@ private void writeMetadatatoPDFsOfEntry(BibDatabaseContext databaseContext, Stri
}
}

-private void writeMetadatatoPdfByCitekey(BibDatabaseContext databaseContext, BibDatabase dataBase, Vector<String> citeKeys, Charset encoding, FilePreferences filePreferences, XmpPdfExporter xmpPdfExporter, EmbeddedBibFilePdfExporter embeddedBibFilePdfExporter, boolean writeXMP, boolean embeddBibfile) {
+private void writeMetadatatoPdfByCitekey(BibDatabaseContext databaseContext, BibDatabase dataBase, Vector<String> citeKeys, FilePreferences filePreferences, XmpPdfExporter xmpPdfExporter, EmbeddedBibFilePdfExporter embeddedBibFilePdfExporter, boolean writeXMP, boolean embeddBibfile) {
for (String citeKey : citeKeys) {
List<BibEntry> bibEntryList = dataBase.getEntriesByCitationKey(citeKey);
if (bibEntryList.isEmpty()) {
System.err.printf("Skipped - Cannot find %s in library.%n", citeKey);
continue;
}
for (BibEntry entry : bibEntryList) {
-writeMetadatatoPDFsOfEntry(databaseContext, citeKey, entry, encoding, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);
+writeMetadatatoPDFsOfEntry(databaseContext, citeKey, entry, filePreferences, xmpPdfExporter, embeddedBibFilePdfExporter, writeXMP, embeddBibfile);
}
}
}

-private void writeMetadatatoPdfByFileNames(BibDatabaseContext databaseContext, BibDatabase dataBase, Vector<String> fileNames, Charset encoding, FilePreferences filePreferences, XmpPdfExporter xmpPdfExporter, EmbeddedBibFilePdfExporter embeddedBibFilePdfExporter, boolean writeXMP, boolean embeddBibfile) {
+private void writeMetadatatoPdfByFileNames(BibDatabaseContext databaseContext, BibDatabase dataBase, Vector<String> fileNames, FilePreferences filePreferences, XmpPdfExporter xmpPdfExporter, EmbeddedBibFilePdfExporter embeddedBibFilePdfExporter, boolean writeXMP, boolean embeddBibfile) {
for (String fileName : fileNames) {
Path filePath = Path.of(fileName);
if (!filePath.isAbsolute()) {
@@ -348,14 +347,14 @@ private void writeMetadatatoPdfByFileNames(BibDatabaseContext databaseContext, B
if (Files.exists(filePath)) {
try {
if (writeXMP) {
-if (xmpPdfExporter.exportToFileByPath(databaseContext, dataBase, encoding, filePreferences, filePath)) {
+if (xmpPdfExporter.exportToFileByPath(databaseContext, dataBase, filePreferences, filePath)) {
System.out.printf("Successfully written XMP metadata of at least one entry to %s%n", fileName);
} else {
System.out.printf("File %s is not linked to any entry in database.%n", fileName);
}
}
if (embeddBibfile) {
-if (embeddedBibFilePdfExporter.exportToFileByPath(databaseContext, dataBase, encoding, filePreferences, filePath)) {
+if (embeddedBibFilePdfExporter.exportToFileByPath(databaseContext, dataBase, filePreferences, filePath)) {
System.out.printf("Successfully embedded XMP metadata of at least one entry to %s%n", fileName);
} else {
System.out.printf("File %s is not linked to any entry in database.%n", fileName);
@@ -410,9 +409,7 @@ private boolean exportMatches(List<ParserResult> loaded) {
// We have an TemplateExporter instance:
try {
System.out.println(Localization.lang("Exporting") + ": " + data[1]);
-exporter.get().export(databaseContext, Path.of(data[1]),
-databaseContext.getMetaData().getEncoding().orElse(preferencesService.getGeneralPreferences().getDefaultEncoding()),
-matches);
+exporter.get().export(databaseContext, Path.of(data[1]), matches);
 } catch (Exception ex) {
 System.err.println(Localization.lang("Could not export file") + " '" + data[1] + "': "
 + Throwables.getStackTraceAsString(ex));
@@ -455,7 +452,6 @@ private List<ParserResult> importAndOpenFiles() {
try {
pr = OpenDatabase.loadDatabase(
Path.of(aLeftOver),
-preferencesService.getGeneralPreferences(),
preferencesService.getImportFormatPreferences(),
Globals.getFileUpdateMonitor());
} catch (IOException ex) {
@@ -530,7 +526,7 @@ private void saveDatabase(BibDatabase newBase, String subName) {
System.out.println(Localization.lang("Saving") + ": " + subName);
GeneralPreferences generalPreferences = preferencesService.getGeneralPreferences();
SavePreferences savePreferences = preferencesService.getSavePreferences();
-AtomicFileWriter fileWriter = new AtomicFileWriter(Path.of(subName), generalPreferences.getDefaultEncoding());
+AtomicFileWriter fileWriter = new AtomicFileWriter(Path.of(subName), StandardCharsets.UTF_8);
BibWriter bibWriter = new BibWriter(fileWriter, OS.NEWLINE);
BibDatabaseWriter databaseWriter = new BibtexDatabaseWriter(bibWriter, generalPreferences, savePreferences, Globals.entryTypesManager);
databaseWriter.saveDatabase(new BibDatabaseContext(newBase));
@@ -539,9 +535,8 @@ private void saveDatabase(BibDatabase newBase, String subName) {
 if (fileWriter.hasEncodingProblems()) {
 System.err.println(Localization.lang("Warning") + ": "
 + Localization.lang(
-"The chosen encoding '%0' could not encode the following characters:",
-generalPreferences.getDefaultEncoding().displayName())
-+ " " + fileWriter.getEncodingProblems());
+"UTF-8 could not be used to encode the following characters:")
++ " " + fileWriter.getEncodingProblems());
}
} catch (IOException ex) {
System.err.println(Localization.lang("Could not save file.") + "\n" + ex.getLocalizedMessage());
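
The `saveDatabase` hunk now constructs the writer with `StandardCharsets.UTF_8` and still reports `fileWriter.getEncodingProblems()`. As a rough illustration of what such a check can look like, the sketch below collects the code points that a given charset cannot encode. It assumes nothing about how `AtomicFileWriter` tracks problems internally; `EncodingProblemSketch` and `findUnencodable` are hypothetical names.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of JabRef.
final class EncodingProblemSketch {

    /** Returns, in order of first appearance, the code points the charset cannot encode. */
    static Set<String> findUnencodable(String text, Charset charset) {
        CharsetEncoder encoder = charset.newEncoder();
        Set<String> problems = new LinkedHashSet<>();
        // Walk by code point so that valid surrogate pairs are tested as one unit
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String symbol = new String(Character.toChars(codePoint));
            if (!encoder.canEncode(symbol)) {
                problems.add(symbol);
            }
            i += Character.charCount(codePoint);
        }
        return problems;
    }

    public static void main(String[] args) {
        String sample = "G\u00f6del \u20ac"; // o with diaeresis, euro sign
        System.out.println(findUnencodable(sample, StandardCharsets.US_ASCII));
        System.out.println(findUnencodable(sample, StandardCharsets.UTF_8));
    }
}
```

With UTF-8 the result is empty for any well-formed text, because only unpaired surrogates cannot be encoded; with a legacy charset such as US-ASCII the accented letter and the euro sign would both be reported.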
@@ -581,8 +576,6 @@ private void exportFile(List<ParserResult> loaded, String[] data) {
// We have an exporter:
try {
exporter.get().export(pr.getDatabaseContext(), Path.of(data[0]),
-pr.getDatabaseContext().getMetaData().getEncoding()
-.orElse(preferencesService.getGeneralPreferences().getDefaultEncoding()),
pr.getDatabaseContext().getDatabase().getEntries());
} catch (Exception ex) {
System.err.println(Localization.lang("Could not export file") + " '" + data[0] + "': "
2 changes: 1 addition & 1 deletion src/main/java/org/jabref/gui/JabRefFrame.java
@@ -826,7 +826,7 @@ private MenuBar createMenu() {

new SeparatorMenuItem(),

-factory.createMenuItem(StandardActions.WRITE_METADATA_TO_PDF, new WriteMetadataToPdfAction(stateManager, prefs.getGeneralPreferences().getDefaultBibDatabaseMode(), Globals.entryTypesManager, prefs.getFieldWriterPreferences(), dialogService, taskExecutor, prefs.getFilePreferences(), prefs.getXmpPreferences(), prefs.getGeneralPreferences().getDefaultEncoding())),
+factory.createMenuItem(StandardActions.WRITE_METADATA_TO_PDF, new WriteMetadataToPdfAction(stateManager, prefs.getGeneralPreferences().getDefaultBibDatabaseMode(), Globals.entryTypesManager, prefs.getFieldWriterPreferences(), dialogService, taskExecutor, prefs.getFilePreferences(), prefs.getXmpPreferences())),
factory.createMenuItem(StandardActions.COPY_LINKED_FILES, new CopyFilesAction(dialogService, prefs, stateManager)),

new SeparatorMenuItem(),
1 change: 0 additions & 1 deletion src/main/java/org/jabref/gui/JabRefGUI.java
@@ -279,7 +279,6 @@ private void openLastEditedDatabases() {
try {
parsedDatabase = OpenDatabase.loadDatabase(
dbFile,
-preferencesService.getGeneralPreferences(),
preferencesService.getImportFormatPreferences(),
Globals.getFileUpdateMonitor());
} catch (IOException ex) {
2 changes: 1 addition & 1 deletion src/main/java/org/jabref/gui/collab/ChangeScanner.java
@@ -55,7 +55,7 @@ public List<DatabaseChangeViewModel> scanForChanges() {
// Important: apply all post-load actions
ImportFormatPreferences importFormatPreferences = preferencesService.getImportFormatPreferences();
GeneralPreferences generalPreferences = preferencesService.getGeneralPreferences();
-ParserResult result = OpenDatabase.loadDatabase(database.getDatabasePath().get(), generalPreferences, importFormatPreferences, new DummyFileUpdateMonitor());
+ParserResult result = OpenDatabase.loadDatabase(database.getDatabasePath().get(), importFormatPreferences, new DummyFileUpdateMonitor());
BibDatabaseContext databaseOnDisk = result.getDatabaseContext();

// Start looking at changes.
6 changes: 2 additions & 4 deletions src/main/java/org/jabref/gui/entryeditor/EntryEditor.java
@@ -370,8 +370,7 @@ private void setupToolBar() {
preferencesService.getImporterPreferences(),
preferencesService.getImportFormatPreferences(),
preferencesService.getFilePreferences(),
-databaseContext,
-preferencesService.getGeneralPreferences().getDefaultEncoding());
+databaseContext);
for (EntryBasedFetcher fetcher : entryBasedFetchers) {
MenuItem fetcherMenuItem = new MenuItem(fetcher.getName());
if (fetcher instanceof PdfMergeMetadataImporter.EntryBasedFetcherWrapper) {
@@ -383,8 +382,7 @@ private void setupToolBar() {
preferencesService.getImporterPreferences(),
preferencesService.getImportFormatPreferences(),
preferencesService.getFilePreferences(),
-databaseContext,
-preferencesService.getGeneralPreferences().getDefaultEncoding());
+databaseContext);
fetchAndMerge(pdfMergeMetadataImporter);
});
} else {
4 changes: 0 additions & 4 deletions src/main/java/org/jabref/gui/exporter/ExportCommand.java
@@ -118,10 +118,6 @@ private void export(Path file, FileChooser.ExtensionFilter selectedExtensionFilt
.wrap(() -> {
format.export(stateManager.getActiveDatabase().get(),
file,
-stateManager.getActiveDatabase().get()
-.getMetaData()
-.getEncoding()
-.orElse(preferences.getGeneralPreferences().getDefaultEncoding()),
finEntries);
return null; // can not use BackgroundTask.wrap(Runnable) because Runnable.run() can't throw Exceptions
})
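
In the hunks shown here the `Charset` argument disappears from the `export(...)` calls because, per the CHANGELOG entry, exports are now always UTF-8. The sketch below shows the general shape of an export routine that fixes the charset internally instead of accepting it as a parameter. It is not JabRef's exporter API; `Utf8ExportSketch`, its `export` method, and the use of plain strings as entries are assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of JabRef.
final class Utf8ExportSketch {

    /** Writes one entry per line; the charset is pinned rather than passed in by callers. */
    static void export(Path file, List<String> entries) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (String entry : entries) {
                writer.write(entry);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path target = Files.createTempFile("export-sketch", ".txt");
        export(target, List.of("K\u00f6ln", "Z\u00fcrich"));
        System.out.println(Files.readString(target, StandardCharsets.UTF_8));
    }
}
```

Pinning the charset inside the routine removes one parameter from every call site, which matches the pattern of deletions in this diff.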