Hi everyone!
This is an update to a blog post I made a while back, which you can find here!
The article wasn’t very well written and could definitely do with some improvements, which is why I’m coming back to it. I’ve also simplified the title, settling on ‘PST Importing With AzCopy’.
I’ve also put this new post on my Medium page, in case you prefer reading there!
Terms
O365/Office365 – A comprehensive suite of cloud-based productivity tools and services developed by Microsoft. It encompasses a range of applications and services designed to facilitate communication, collaboration, and productivity within organizations. Users can access familiar Office applications such as Word, Excel, and PowerPoint, as well as communication tools like Outlook and Teams, all hosted in the cloud.
AzCopy – A command-line tool developed by Microsoft that facilitates the transfer of data to and from Azure Storage. It is designed to efficiently copy and synchronize large volumes of data between on-premises environments and Azure cloud storage. Current versions of AzCopy support Blob storage and Azure Files; Azure Table Storage was only handled by older releases.
PST – An acronym for ‘Personal Storage Table’. This is a file format used by Microsoft Outlook to store email messages, calendar events, and other data items on a user’s local computer. PST files enable users to archive and organize their Outlook data, providing a way to manage and store emails and other information offline.
Azure – A comprehensive cloud computing platform developed by Microsoft. It provides a wide range of services and solutions to help organizations build, deploy, and manage applications and services through Microsoft’s global network of data centres.
Why?
In my previous job, I routinely had to import a huge number of PST files into our Office365 environment so that they could be added to users’ mailboxes.
The manual way sucks! It’s a pain in the backside and leaves a lot of security and compliance issues on the table. You’d have to manually give yourself access to the user’s mailbox, add their mailbox to Outlook and then import the PST. On top of that, the import is a blocking task: Outlook will be completely frozen until the entire PST has imported. It can also crash without warning, and without telling you where it failed.
The Better Method
The better method for importing a large number of PST files for multiple people is to leverage AzCopy, Azure and the built-in Office365 Compliance portal.
This allows you to import the PST files in an automated and compliant manner with proper error handling, using a mapping file so that Office365 knows exactly which mailbox each PST needs to be imported into.
Let’s Start – PST Importing With AzCopy!
Import Request
First things first, we need to create a new import request in Office365. You can do this through the below steps:
- Log in to the Compliance Portal in Microsoft Edge
- Head over to ‘Information Governance’
- Then into ‘Import’
- Once there, create a new import and name it according to your current project
Once that has been created, you can select the option to upload your data, instead of physically shipping media to Microsoft. I’ve never used the shipping option, so let me know if you have. I’d be interested to hear about that process from someone who has done it first-hand.
Now that you’ve selected the ‘Upload your data’ option, you’ll receive a very long SAS URL. Make sure to copy this link and save it somewhere. You’ll need it later on, and you might need it again even further down the road.
This is also why I told you to use Microsoft Edge in the first step above: Microsoft prevents you from downloading the AzCopy program in any other web browser. It’s a pain, I know! At this stage, you should have the option to download the Azure AzCopy program from the same window.
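Since that SAS URL is so long, it’s easy to lose the end of it when copying and pasting. Here’s a minimal Python sketch of a rough sanity check on the saved link. The function name is my own invention, and the only thing it relies on is that a SAS URL is an https link carrying a sig (signature) query parameter. It can’t tell you whether the link is actually valid, only whether it looks obviously cut off.

```python
from urllib.parse import urlsplit, parse_qs


def looks_like_complete_sas_url(url):
    """Rough check: https link with a host and a non-empty 'sig' parameter.

    This is only a heuristic (hypothetical helper, not part of any Microsoft
    tooling). A URL truncated in the middle of the signature would still pass.
    """
    parts = urlsplit(url.strip())
    query = parse_qs(parts.query)  # parameters with blank values are dropped
    return parts.scheme == "https" and bool(parts.netloc) and "sig" in query
```

If this returns False for the link you saved, go back to the import request and copy the URL again before starting the upload.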
Data Upload
Now that we have the Azure AzCopy program and the SAS URL, we can go ahead and upload the PSTs to Microsoft.
Open a PowerShell prompt in the same location as the AzCopy program and run the below command:
azcopy.exe copy "path/to/folder/with/the/psts" "SAS URL" --recursive
This is the AzCopy v10 syntax. The --recursive parameter seems to be required when the source is a folder, as AzCopy has always flat out complained for me without it.
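If you’d rather script the upload than type the command by hand, here’s a minimal Python sketch that builds the same call. It assumes AzCopy v10, where the call goes through the copy subcommand and the flag is spelled --recursive. The function names and the default azcopy.exe location are my own assumptions, so adjust them to your setup.

```python
import subprocess


def build_upload_command(pst_folder, sas_url, azcopy_path="azcopy.exe"):
    """Return the AzCopy v10 argument list for uploading a folder of PSTs.

    Hypothetical helper: azcopy_path assumes azcopy.exe sits in the current
    folder or on PATH.
    """
    if not sas_url.lower().startswith("https://"):
        raise ValueError("The SAS URL should start with https://")
    # Passing a list (not one big string) avoids quoting problems with the
    # '&' and '%' characters that SAS URLs contain.
    return [azcopy_path, "copy", str(pst_folder), sas_url, "--recursive"]


def upload(pst_folder, sas_url):
    """Run AzCopy and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_upload_command(pst_folder, sas_url), check=True)
```

Calling upload with your PST folder and the saved SAS URL then behaves like the manual command, with AzCopy’s own progress output still printed to the console.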
You can now leave this program to run. Depending on the number of PST files and their sizes, it could take a long time. The longest it’s taken for me is 3 days, but that was a huge import.
I really do like this program though, as it tells you how far along the upload is. There are too many tools out there which don’t give you this information and just show you a black screen, so you don’t know if they’re working or not. Azure AzCopy is cool!
It’s also worth noting that whilst the PST files are uploading, you can leave the Compliance window and even close it completely. You’ll be able to move onto the next step whenever the upload is complete.
Mapping File
The mapping file is next. We need to tell Office365 how we want to import the PST files we’ve uploaded.
This step is often the most confusing, but I’ll run you through it. You can grab a template of the mapping file from Microsoft.
If you open the CSV file in the program of your choice, you’ll see the columns that need to be filled in. For all of my imports, I’ve always left the following columns blank: TargetRootFolder; ContentCodePage; SPFileContainer; SPManifestContainer and SPSiteURL.
Actually, that’s a lie. The TargetRootFolder is handy because you can tell the PST file to be imported into a specific folder in the user’s mailbox rather than into the root of their mailbox. This makes the import cleaner in some scenarios. In the past, I’ve set it to something like Imported_PST so that the user knows exactly where their imported mailbox data is.
Since my PST files were on a server in the path C:\Company\over_20gb, I needed to set the FilePath column in the CSV to ‘over_20gb’ for each row.
You can see an example of this mapping file in the below image:
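For anyone who would rather generate the mapping file than fill it in by hand, here’s a minimal Python sketch. The column order is my recollection of Microsoft’s template, so treat it as an assumption and compare it against the template you downloaded. The function names, the example PST names and the example mailboxes are all made up for illustration.

```python
import csv

# Column order assumed from Microsoft's PST import mapping template;
# check it against the template you downloaded before relying on it.
COLUMNS = [
    "Workload", "FilePath", "Name", "Mailbox", "IsArchive",
    "TargetRootFolder", "ContentCodePage", "SPFileContainer",
    "SPManifestContainer", "SPSiteUrl",
]


def build_rows(pst_to_mailbox, file_path, target_root_folder=""):
    """Build one mapping row per PST; unused columns are left blank."""
    rows = []
    for pst_name, mailbox in pst_to_mailbox.items():
        row = dict.fromkeys(COLUMNS, "")
        row.update({
            "Workload": "Exchange",
            "FilePath": file_path,        # folder the PSTs were uploaded into
            "Name": pst_name,
            "Mailbox": mailbox,
            "IsArchive": "FALSE",         # FALSE = primary mailbox, not the archive
            "TargetRootFolder": target_root_folder,
        })
        rows.append(row)
    return rows


def write_mapping_file(rows, out_path):
    """Write the rows to a CSV file with the header line first."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)


# Example usage (hypothetical names):
# rows = build_rows({"alice.pst": "alice@example.com"}, "over_20gb", "Imported_PST")
# write_mapping_file(rows, "mapping.csv")
```

Building the rows separately from writing the file makes it easy to eyeball or tweak them before anything lands on disk.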
Final
Before you hop into this step, make sure that the following are true:
- You have uploaded all your PST files
- You have your mapping file prepared
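A quick way to double-check both points is to compare the PST files in your local upload folder against the Name column of the mapping file. Here’s a minimal Python sketch of that idea. The function name is hypothetical, and it uses the local folder as a stand-in for what was uploaded, so it can’t confirm the upload itself succeeded. Names are compared exactly, on the assumption that case matters in the mapping file.

```python
import csv
from pathlib import Path


def find_mismatches(pst_folder, mapping_csv):
    """Return (PSTs on disk with no mapping row, mapping rows with no PST)."""
    on_disk = {
        p.name
        for p in Path(pst_folder).rglob("*")
        if p.is_file() and p.suffix.lower() == ".pst"
    }
    # utf-8-sig copes with the byte order mark some spreadsheet tools add.
    with open(mapping_csv, newline="", encoding="utf-8-sig") as handle:
        mapped = {row["Name"].strip() for row in csv.DictReader(handle)}
    return sorted(on_disk - mapped), sorted(mapped - on_disk)
```

If both returned lists are empty, every PST in the folder has a row in the mapping file and every row points at a file that exists.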
Awesome! Let’s jump into the final step. Follow the below steps:
- Log in to the Compliance Portal in Microsoft Edge
- Head over to ‘Information Governance’
- Then into ‘Import’
- Find the import request you made previously in step 1
From here, you can tick both boxes to confirm that the data upload is complete and that you have access to your mapping file. I believe these tick boxes show as ‘I’m done uploading my file’ and ‘I have access to the mapping file’.
You can now use the Select Mapping File button and upload your file. After this, you can use the Validate button which will do some initial checks on the mapping file you are trying to use.
If all is well, you can now choose whether to filter the data you uploaded or to import everything. Filtering is useful as you can tell it to import only recent emails, archive emails or certain folders. This is by far the easiest way to do this, and such a useful step from Microsoft. The only negative is that the filtering applies to all the PST files you uploaded to this import request.
This means that if you want to use different filter values for different PST files, you’ll need to create a separate import request for each filter you’d like to use.
After this, you can start the import and you’ll get an accurate progress bar and useful logs after the import is completed.
Just be prepared to wait a while since this process can take a very loooooooooong time!
Enjoy! 🎉