Wanna see what a .docx file looks like?
Have you ever gotten into a position where you wanted to pull your hair out just so you could extract the images enclosed in a "fabulous" Microsoft Word file? Well, if you are using the old version, with the .doc extension - keep pulling your hair :)
BUT, if you are on Word 2007, with the .docx file extension, I can show you how to get everything out of the file. You know I really hate Microsoft, especially its approach of "concealing" everything. Just 10 minutes ago I wanted to extract a couple of images from a Word file, well not exactly a couple of images - a couple of "hundred" images actually.
I tried everything. I tried to copy and paste into Photoshop, but that stupid Microsoft Word won't copy the image to the clipboard so Photoshop could get it. NO! I tried to export it as a web page, thinking it would make an HTML file and a folder right next to it with all the files in it, so I could get my images from there. But guess what, I did that and it put an underscore (_) in place of the last character of each image file extension. It turned ".jpg" into ".jp_". I do know how to fix that, but I don't want to do it for a hundred files. Anyway, I tried taking a print-screen and pasting it into Photoshop. I mean I could do it, but it would take forever to edit all the pictures that way, making new files, saving them, etc.
So here is what I did. I had heard that the Microsoft Office 2007 file formats (the ones that add an "x" to the end of the old-version extension) are all XML based. I tried out multiple things, like renaming the file to .xml, but then I did something that slit the file open and showed me everything it has in it. I really appreciate Microsoft's move to make its files XML based so they would be compatible with other programs / standards.
Here is what you can do to extract the images from a .docx MS Word file.
- Duplicate your .docx file
- Rename the copy so it ends in .zip instead of .docx - you don't want "filename.docx.zip", just "filename.zip"
- DO NOT make the file a zip file (i.e. "add to archive") by using WinRAR or any other zipping program, just RENAME IT
- When you have renamed your .docx to .zip, just double-click it (you need a zip file viewer to open the file)
- You will find everything in there in a proper folder hierarchy.
- Go to the folder "word >> media" and you will find all the images there.
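
If you have a lot of files to go through, the same steps can be scripted. Here is a minimal sketch in Python, assuming the images sit under "word/media/" inside the archive as described above. The function name extract_docx_images and the file names in the usage note are made up for illustration. Python's zipfile module reads the .docx directly, so no renaming is needed for this route.

```python
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, out_dir):
    """Copy every file stored under word/media/ in a .docx into out_dir.

    Hypothetical helper for illustration; returns the saved file names.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    # A .docx is a zip archive, so zipfile can open it as-is.
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            # Keep only real files inside the media folder.
            if name.startswith("word/media/") and not name.endswith("/"):
                target = out / Path(name).name
                target.write_bytes(archive.read(name))
                saved.append(target.name)
    return saved
```

Calling extract_docx_images("filename.docx", "images") would copy the pictures into an "images" folder. If the list that comes back is empty, the archive has no "word/media" entries.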


4 comments:
There is a nice tool which can do similar things - how to fix docx. As far as I know it is free, and the application helped me yesterday, so try it. The program scans a damaged document, analyzes the data, recovers it and shows it in a preview window, where the user can check it and make sure that it is recovered properly, and then exports the recovered data into a new Word document. It can fix damaged docx files and recover documents of different formats (*.doc, *.docx, *.dot, and *.dotx) as well as *.rtf (rich text files), and it can extract the data and fix a docx file from any removable media, even over the local network of your organization.
Thanks a bunch for the suggestion. It really helps. I'll keep in mind whenever I have to deal with docx files. :)
Well, I tried Nadir's way (rename to .zip) and it didn't work - I didn't get the folder "media..."
And Alexis doesn't provide any concrete direction... so much for .docx!
Did you see the folder hierarchy or not at all? You could share what happened... I'm guessing you saw other folders after renaming the file to .zip but you couldn't find the Media folder? If that's not the case tell me what happened.