Cannot access images included in the content pasted from Microsoft Word
https://bugs.webkit.org/show_bug.cgi?id=124391
<rdar://problem/26862741>
Reviewed by Antti Koivisto.
Source/WebCore:
The bug is caused by the fact that Microsoft Word generates HTML content which references images using file URLs.
Because websites don't have access to arbitrary file URLs, this prevents editors such as TinyMCE from saving
those images.
This patch fixes the problem by converting file URLs for images and all other subresources in the web archive
generated by Microsoft Word to blob URLs, as r222839 did for RTF/RTFD and r222119 did for images.
To avoid revealing privacy sensitive information, such as the absolute local file path to the user's home directory,
which Microsoft Word and other applications in the system include in the web archive placed in the system pasteboard,
this patch also introduces a mechanism to sanitize the HTML content when it is read by DataTransfer's getData.
This patch also introduces sanitization for when HTML is written into the pasteboard, since other applications
in the system which are capable of processing web archives are not necessarily equipped to protect themselves and the
rest of the system from potentially dangerous JavaScript included in the web archive placed in the system pasteboard.
Finally, this patch expands the list of clipboard types that are exposed as "text/html" to the Web platform by
adding the capability to convert RTF, RTFD, and web archive into HTML markup by introducing WebContentMarkupReader,
a new subclass of PasteboardWebContentReader which creates HTML markup instead of a document fragment. Most of
the sanitization process happens in this new class, and will be expanded to WebContentReader to make pasting safer.
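As a rough illustration of the URL rewriting described above, the following C++ sketch replaces every occurrence of a
subresource URL in a markup string with a blob URL. It is a string-level stand-in and not the code in this patch: the
names replaceAll and replaceSubresourceURLs, the map type, and any example URLs are hypothetical, and the real change
operates on the parsed document rather than on raw text.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: replaces every occurrence of `from` in `text` with `to`.
static std::string replaceAll(std::string text, const std::string& from, const std::string& to)
{
    assert(!from.empty()); // An empty pattern would never advance.
    std::size_t position = 0;
    while ((position = text.find(from, position)) != std::string::npos) {
        text.replace(position, from.length(), to);
        position += to.length(); // Skip past the replacement to avoid rescanning it.
    }
    return text;
}

// Hypothetical sketch: maps each subresource URL found in the archive to the
// blob URL created for it, then rewrites the markup accordingly.
std::string replaceSubresourceURLs(std::string markup, const std::map<std::string, std::string>& blobURLForSubresource)
{
    for (const auto& entry : blobURLForSubresource)
        markup = replaceAll(std::move(markup), entry.first, entry.second);
    return markup;
}
```

A plain text substitution like this would also rewrite URLs that merely appear in text content, which is one reason
to prefer working on the parsed document.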
Tests: editing/pasteboard/data-transfer-get-data-on-pasting-html-uses-blob-url.html
editing/pasteboard/data-transfer-set-data-sanitizes-html-when-copying-in-null-origin.html
editing/pasteboard/data-transfer-set-data-sanitizes-html-when-copying.html
editing/pasteboard/data-transfer-set-data-sanitizes-html-when-dragging-in-null-origin.html
http/tests/security/clipboard/copy-paste-html-across-origin-sanitizes-html.html
CopyHTML.Sanitizes
DataInteractionTests.DataTransferSanitizeHTML
PasteRTF.ExposesHTMLTypeInDataTransfer
PasteRTFD.ExposesHTMLTypeInDataTransfer
PasteRTFD.ImageElementUsesBlobURLInHTML
PasteWebArchive.ExposesHTMLTypeInDataTransfer
* dom/DataTransfer.cpp:
(WebCore::originIdentifierForDocument): Moved to Document::originIdentifierForPasteboard.
(WebCore::DataTransfer::createForCopyAndPaste):
(WebCore::DataTransfer::getDataForItem const): Use WebContentMarkupReader to read HTML content so that we can read
web archive, RTF, and RTFD as text/html.
(WebCore::DataTransfer::getData const):
(WebCore::DataTransfer::setData):
(WebCore::DataTransfer::setDataFromItemList): Sanitize the HTML before placing into the system pasteboard.
(WebCore::DataTransfer::createForDragStartEvent):
(WebCore::DataTransfer::createForDrop):
(WebCore::DataTransfer::createForUpdatingDropTarget):
* dom/DataTransfer.h:
* dom/DataTransfer.idl:
* dom/DataTransferItem.cpp:
(WebCore::DataTransferItem::getAsString const):
* dom/Document.cpp:
(WebCore::Document::originIdentifierForPasteboard): Renamed from uniqueIdentifier. Moved the code that uses the origin
string and then falls back to the UUID here from originIdentifierForDocument in DataTransfer.cpp.
* dom/Document.h:
* editing/WebContentReader.cpp:
(WebCore::WebContentMarkupReader::shouldSanitize const): Added.
* editing/WebContentReader.h:
(WebCore::WebContentMarkupReader): Added.
(WebCore::WebContentMarkupReader::WebContentMarkupReader):
* editing/cocoa/WebContentReaderCocoa.mm:
(WebCore::createFragmentFromWebArchive): Extracted out of WebContentReader::readWebArchive to share code.
(WebCore::WebContentReader::readWebArchive):
(WebCore::WebContentMarkupReader::readWebArchive): Added. Reads the web archive, replaces all subresource URLs with
blob URLs, and regenerates the markup using our copy & paste code. The last step is required to strip away any privacy
sensitive information as well as potentially dangerous JavaScript code.
(WebCore::stripMicrosoftPrefix): Extracted out of WebContentReader::readHTML to share code.
(WebCore::WebContentReader::readHTML):
(WebCore::WebContentMarkupReader::readHTML): Added. Only sanitize the markup when it comes from a different origin.
(WebCore::WebContentReader::readRTFD): Added a nullity check for frame.document().
(WebCore::WebContentMarkupReader::readRTFD): Added.
(WebCore::WebContentMarkupReader::readRTF): Added.
* editing/markup.h:
* editing/markup.cpp:
(WebCore::createPageForSanitizingWebContent): Added.
(WebCore::sanitizeMarkup): Added. This function "pastes" the markup into a new isolated document, then reserializes
it using our serialization code for copy. It strips away all invisible information such as comments, and strips away
event handlers and script elements to remove potentially dangerous scripts.
* platform/Pasteboard.h:
* platform/ios/PasteboardIOS.mm:
(WebCore::Pasteboard::readPasteboardWebContentDataForType): Now that this code can be called by DataTransfer, added
checks for the change count to make sure we stop letting web content read if the pasteboard has been changed by
some other application. To do this, turned this function into a member of Pasteboard. Also changed the return type
to a tri-state enum so that the call sites can exit the loop early.
(WebCore::Pasteboard::read):
(WebCore::Pasteboard::readRespectingUTIFidelities):
* platform/ios/PlatformPasteboardIOS.mm:
(WebCore::safeTypeForDOMToReadAndWriteForPlatformType): Treat RTF, RTFD, and web archive as HTML.
* platform/mac/PasteboardMac.mm:
(WebCore::Pasteboard::read): Add the change count checks now that this code can be called by DataTransfer.
* platform/mac/PlatformPasteboardMac.mm:
(WebCore::safeTypeForDOMToReadAndWriteForPlatformType): Treat RTF, RTFD, and web archive as HTML.
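The change count checks and the tri-state return value mentioned above for Pasteboard::read and
readPasteboardWebContentDataForType can be pictured roughly as follows. This is a simplified, self-contained C++
sketch and not the code in this patch: the enumerator names, the FakePasteboard type, and the functions
readDataForType and readFirstSupportedType are assumptions made for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tri-state result: lets callers tell "try the next type" apart
// from "stop reading altogether".
enum class ReaderResult { ReadType, DidNotReadType, PasteboardWasChangedExternally };

// Hypothetical stand-in for the system pasteboard.
struct FakePasteboard {
    long changeCount { 0 };
    std::vector<std::string> types;
};

// Reads one type, refusing to do so if the pasteboard changed since the
// change count was recorded.
ReaderResult readDataForType(const FakePasteboard& pasteboard, long expectedChangeCount, const std::string& type, const std::string& supportedType)
{
    if (pasteboard.changeCount != expectedChangeCount)
        return ReaderResult::PasteboardWasChangedExternally;
    return type == supportedType ? ReaderResult::ReadType : ReaderResult::DidNotReadType;
}

// Walks the types in order and exits the loop early on either a successful
// read or an external change.
ReaderResult readFirstSupportedType(const FakePasteboard& pasteboard, long expectedChangeCount, const std::string& supportedType)
{
    assert(!supportedType.empty());
    for (const auto& type : pasteboard.types) {
        ReaderResult result = readDataForType(pasteboard, expectedChangeCount, type, supportedType);
        if (result != ReaderResult::DidNotReadType)
            return result;
    }
    return ReaderResult::DidNotReadType;
}
```

With a plain boolean return value, a caller could not distinguish "this type was not readable" from "the pasteboard
is no longer the one the user pasted from", so it would keep iterating over types it should no longer touch.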
Tools:
Added tests for sanitizing HTML contents for copy & paste and drag & drop.
* TestWebKitAPI/TestWebKitAPI.xcodeproj/project.pbxproj:
* TestWebKitAPI/Tests/WebKitCocoa/CopyHTML.mm: Added.
(readHTMLFromPasteboard): Added.
(createWebViewWithCustomPasteboardDataEnabled): Added.
(CopyHTML.Sanitizes): Added.
* TestWebKitAPI/Tests/WebKitCocoa/CopyURL.mm:
(createWebViewWithCustomPasteboardDataEnabled): Added to enable more tests on bots.
* TestWebKitAPI/Tests/WebKitCocoa/PasteRTFD.mm:
(writeRTFToPasteboard): Added.
(createWebViewWithCustomPasteboardDataEnabled): Added.
(createHelloWorldString): Added.
(PasteRTF.ExposesHTMLTypeInDataTransfer): Added.
(PasteRTFD.ExposesHTMLTypeInDataTransfer): Added.
(PasteRTFD.ImageElementUsesBlobURLInHTML): Added.
* TestWebKitAPI/Tests/WebKitCocoa/copy-html.html: Added.
* TestWebKitAPI/Tests/WebKitCocoa/paste-rtfd.html: Store the clipboardData contents for
PasteRTF.ExposesHTMLTypeInDataTransfer and PasteRTFD.ExposesHTMLTypeInDataTransfer.
* TestWebKitAPI/Tests/ios/DataInteractionTests.mm:
(DataInteractionTests.DataTransferSanitizeHTML):
LayoutTests:
Added tests for copying & pasting and dragging & dropping HTML contents.
* TestExpectations:
* editing/pasteboard/data-transfer-get-data-on-drop-rich-text-expected.txt: Rebaselined.
* editing/pasteboard/data-transfer-get-data-on-paste-rich-text-expected.txt: Ditto.
* editing/pasteboard/data-transfer-get-data-on-paste-rich-text.html: Modified the test to strip away platform specific
inline style properties.
* editing/pasteboard/data-transfer-get-data-on-pasting-html-uses-blob-url-expected.txt: Added.
* editing/pasteboard/data-transfer-get-data-on-pasting-html-uses-blob-url.html: Added.
* editing/pasteboard/data-transfer-set-data-sanitizes-html-when-copying-expected.txt: Added.
* editing/pasteboard/data-transfer-set-data-sanitizes-html-when-copying-in-null-origin-expected.txt: Added.
* editing/pasteboard/data-transfer-set-data-sanitizes-html-when-copying-in-null-origin.html: Added.
* editing/pasteboard/data-transfer-set-data-sanitizes-html-when-copying.html: Added.
* editing/pasteboard/data-transfer-set-data-sanitizes-html-when-dragging-in-null-origin-expected.txt: Added.
* editing/pasteboard/data-transfer-set-data-sanitizes-html-when-dragging-in-null-origin.html: Added.
* editing/pasteboard/data-transfer-set-data-sanitizes-url-when-dragging-in-null-origin.html: Removed the superfluous
call to setTimeout that was erroneously added during debugging. Also updated the test to not claim all URL and
HTML values are read in the same origin, and updated the assertion for the cross-origin case as it's now sanitized.
* editing/pasteboard/onpaste-text-html-expected.txt: Rebaselined. The order of CSS properties has changed.
* http/tests/security/clipboard/copy-paste-html-across-origin-sanitizes-html-expected.txt: Added.
* http/tests/security/clipboard/copy-paste-html-across-origin-sanitizes-html.html: Added.
* http/tests/security/clipboard/copy-paste-url-across-origin-sanitizes-url.html:
* http/tests/security/clipboard/resources/copy-html.html: Added.
* http/tests/security/clipboard/resources/copy-url.html: Renamed from copy.html.
* platform/ios-wk2/editing/pasteboard/data-transfer-get-data-on-paste-rich-text-expected.txt: Removed.
* platform/ios-wk1/editing/pasteboard/data-transfer-get-data-on-paste-rich-text-expected.txt: Removed.
* platform/mac-wk1/TestExpectations:
git-svn-id: http://svn.webkit.org/repository/webkit/trunk@223440 268f45cc-cd09-0410-ab3c-d52691b4dbfc
diff --git a/LayoutTests/TestExpectations b/LayoutTests/TestExpectations
index 81f5228..1930a37 100644
--- a/LayoutTests/TestExpectations
+++ b/LayoutTests/TestExpectations
@@ -74,7 +74,9 @@
editing/pasteboard/data-transfer-get-data-on-drop-rich-text.html [ Skip ]
editing/pasteboard/data-transfer-get-data-on-drop-url.html [ Skip ]
editing/pasteboard/data-transfer-is-unique-for-dragenter-and-dragleave.html [ Skip ]
+editing/pasteboard/data-transfer-set-data-sanitize-html-when-dragging-in-null-origin.html [ Skip ]
editing/pasteboard/data-transfer-set-data-sanitize-url-when-dragging-in-null-origin.html [ Skip ]
+
editing/pasteboard/drag-end-crash-accessing-item-list.html [ Skip ]
editing/pasteboard/data-transfer-item-list-add-file-on-drag.html [ Skip ]
editing/pasteboard/data-transfer-items-drop-file.html [ Skip ]