Python: How to Automate Word Docs


Every time I start thinking that my job is bad, I try to remind myself that Stanley Kubrick had a secretary. The guy was notorious for driving actors into the ground. Making them do takes over and over and over. Can you imagine having to handle his administrative work?

If IMDb’s trivia page is to be believed, that poor soul had to spend weeks typing out the infamous ‘All work and no play makes Jack a dull boy’ novel from hell. On the off-chance I ever find myself similarly working for a maniac, I’m attempting to master the python-docx library, and I’ve been pleasantly surprised with how far you can get with just the basics.

Intro to python-docx

Alright, navigate to your Python folder in the command prompt and type in:

pip install python-docx


Once that’s installed, import it in IDLE and set up a document object. You can either open a document that already exists by passing its file path:

import docx
doc = docx.Document('C:\\Users\\Desktop\\Test.docx')

Or you can make a new document:

from docx import Document
doc = Document()

How to Add a Paragraph

Documents are broken up by paragraph.

You can add text to a document by adding a new paragraph. It won’t show up immediately. You’ll need to save the document to see the changes. Then poof. Like magic:

doc.add_paragraph('This is a new paragraph.')
doc.save('C:\\Users\\Desktop\\Test.docx')

Now that you have a paragraph, you can access it by index. Remember, these start counting at 0. So if you have the above document saved as doc in the shell, you can see the text of the first paragraph with print(doc.paragraphs[0].text).

How to Add Text to a Paragraph

Paragraphs are broken up by runs. Somewhat confusingly, these are not necessarily sentences.

They are stretches of text within a paragraph that share a specific style: bold, italics, font color. When you change anything about the style, you start a new run. However, if you add a run using the line below, it’s separate from the one before even if the style is the same.

doc.paragraphs[0].add_run(' This is the next sentence.')

How to Change up the Style

You can reference the run and assign a True/False value to turn things like boldness or underlining on and off. Just remember, capitalization counts: Python wants True and False, not true and false.

# runs[0] and runs[1] are the two runs added earlier, so the new ones start at index 2.
doc.paragraphs[0].add_run(' This is a bold sentence.')
doc.paragraphs[0].runs[2].bold = True
doc.paragraphs[0].add_run(' This is in italics.')
doc.paragraphs[0].runs[3].italic = True
doc.paragraphs[0].add_run(' This is underlined.')
doc.paragraphs[0].runs[4].underline = True

How to Add Pictures

You might be wondering what a practical application would be for this? Why on earth would you want to Python a Word Document?

For starters, say you’ve got a bunch of images, conveniently named sequentially…


You can plug those into a loop and have Python put them in at a specific size. Just remember to concatenate the file path strings with + signs. And within the loop, you’ll need to convert x to a string to include it in the file path name.

from docx import Document
from docx.shared import Inches

doc = Document()
filepath = 'C:\\Users\\Desktop\\'

for x in range(1,13):
    doc.add_picture(filepath + str(x) + '.tif', width=Inches(8.0))

doc.save(filepath + 'Hungry.docx')

Open the document and you now have 12 images of a precision-sized Hungry dude.