Welcome to part two of our three-part series on how to do more with your Yardi data. Written by Vimal Vachhani, a senior data architect at Saxony Partners, this series focuses on how data can be extracted from Yardi and used to create actionable reports and give you a competitive advantage.
In part one, Vimal provided step-by-step instructions on how to get your Yardi data through software as a service (SaaS) and/or the Private Cloud environment. In part two, Vimal explains how to use “log shipping” to get your data from Yardi:
Until recently, obtaining data externally from Yardi was limited to two methods: a nightly backup of the database, or direct SQL query access over VPN (if you were paying for a private cloud).
Both worked well, but with minor flaws.
The nightly backup process only lets you refresh your data environment once a night, with the previous day’s data. As organizational reporting and analytics requirements evolve and demand faster, closer-to-real-time updates, having stale data can be a hindrance. Not to mention, it can take a lot of time to pull a large database from the FTP server, unzip it, and restore it.
The second method (direct SQL query) solved several of those issues. You could query data ad hoc and build change-capture logic directly into the data layer. You could use efficient load logic to identify and load only new records. This could tax the server’s memory, however, as look-ups comparing millions of rows of data can be quite intensive. In any case, many Yardi clients are not using the private cloud, making this option inaccessible to them.
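To make the “load only new records” idea concrete, here is a minimal sketch of watermark-based incremental loading: keep the highest key already loaded and query only rows beyond it. The table and column names are illustrative placeholders, not actual Yardi schema.

```python
# Sketch of incremental (watermark-based) loading. "Trans" and "TransId"
# are hypothetical names, not real Yardi objects.

def incremental_query(table: str, key_column: str, watermark: int) -> str:
    """Build a SELECT that pulls only rows added since the last load."""
    return (f"SELECT * FROM {table} "
            f"WHERE {key_column} > {watermark} "
            f"ORDER BY {key_column};")

# Example: everything added since key 1,000,000 was last seen.
print(incremental_query("Trans", "TransId", 1_000_000))
```

After each load, you would record the new maximum key so the next run starts from there.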
This leads us to the new method of getting your data from Yardi, which is a natural evolution of the original backup process.
Log shipping allows you to automatically send transaction log backups from the primary database to a secondary database. Each transaction log stores only the changes since the previous backup, creating a series of smaller backup files that can be applied to the secondary server to produce a replicated, read-only database.
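At its core, applying a shipped log is one T-SQL statement per file. Here’s a small sketch, in Python, of the kind of statement involved; the database name and file paths are placeholders, not Yardi-specific values.

```python
# Sketch of the RESTORE LOG statement a log-shipping restore issues.
# "YardiDB" and the paths below are hypothetical placeholders.

def restore_log_sql(database: str, log_path: str, standby_path: str) -> str:
    """Build a RESTORE LOG statement that applies one transaction log
    backup and leaves the database readable (STANDBY mode)."""
    return (
        f"RESTORE LOG [{database}] "
        f"FROM DISK = N'{log_path}' "
        f"WITH STANDBY = N'{standby_path}';"
    )

# Example: apply one incremental log to the read-only secondary.
print(restore_log_sql("YardiDB", r"D:\logs\YardiDB_0830.trn",
                      r"D:\standby\YardiDB_undo.dat"))
```

STANDBY (rather than NORECOVERY) is what keeps the secondary readable between restores, which matters for a reporting copy.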
Yardi will provide clients with a shared folder where transaction logs from the primary server are created and stored (similar to the database backup method). You can then build a process to automatically pick up these files. Once you decompress the files (using the LiteSpeed backup tool) and store them locally, you can apply them to the secondary database in your local environment by following these steps:
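The pick-up step is mostly a matter of filtering and ordering the files in that shared folder. Here’s a rough sketch; the `.trn` extension and the zero-padded timestamp in the file name are assumptions, so adjust the filter to match what Yardi actually delivers.

```python
# Sketch of the pick-up step: given the file names in the shared folder,
# select the transaction log backups and return them in apply order.

def logs_in_apply_order(file_names):
    """Filter to transaction log backups and sort so the oldest log is
    restored first (lexicographic order works when the name embeds a
    zero-padded timestamp)."""
    logs = [f for f in file_names if f.lower().endswith(".trn")]
    return sorted(logs)

shared_folder = [
    "YardiDB_20240102_0300.trn",
    "readme.txt",
    "YardiDB_20240101_0300.trn",
    "YardiDB_20240101_1500.trn",
]
print(logs_in_apply_order(shared_folder))
```

In a real job you would list the folder with `os.listdir` or `pathlib`, then decompress each file before restoring it.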
- All user sessions will need to be terminated. As each log should only take a few minutes to apply, it’s best to have an SLA with your users so they know when they will be kicked off the server. If you are restoring multiple logs, then you should script this process and run it for every log file. This prevents any sneaky users or processes from jumping back on and causing an error in the restore.
- The database must always remain in read-only or standby mode. If you switch out of this mode, logs can no longer be applied, and you will need to do a full restore before log shipping can resume.
- Script the process to restore logs in the right order. In my case, I’ve supplemented our SQL scripts with an ETL control table to make sure the right files get applied in the right order. There’s no need to worry about reapplying older files, as SQL Server is smart enough to know those are not needed and will skip over them. This SQL script can then be packaged up in your ETL tool.
- Users cannot be created on the secondary database. Any users needed for the secondary must be created on the primary database, and their SUSER_SID can be extracted via a query. Then, on the local server hosting the secondary, you can create a new login sharing the same SID. This workaround lets you create your own IDs and passwords for access to the secondary without having to share any credentials on the primary server.
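The first three steps above can be sketched as one scripted restore loop. In this sketch the ETL control table is stood in for by a plain set of already-applied file names, and the executed strings are placeholders for the real T-SQL (session termination and `RESTORE LOG ... WITH STANDBY`) your ETL tool would run.

```python
# Sketch of a scripted restore cycle: kill sessions, apply each pending
# log in order, and record success in the control table. "YardiDB" and
# the standby path are hypothetical placeholders.

def run_restore_cycle(log_files, applied, execute):
    """Apply each pending log in order, terminating user sessions first.

    log_files: log backup file names
    applied:   set of files already restored (the 'control table')
    execute:   callable that runs one T-SQL command
    """
    for log in sorted(log_files):
        if log in applied:            # control table says skip it
            continue
        execute("-- terminate all user sessions on the secondary")
        execute(f"RESTORE LOG [YardiDB] FROM DISK = N'{log}' "
                f"WITH STANDBY = N'D:\\standby\\undo.dat';")
        applied.add(log)              # record success in the control table
    return applied
```

Running the whole loop in one script is what keeps sneaky users or processes from jumping back on between restores.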
It’s important to note that before the log shipping process can be set up, you will need to do a full restore of the entire database as a starting point. It’s recommended to keep this process set up as a job in case there is a hiccup in log shipping: it lets you re-baseline the data, can run overnight or over a weekend, and doubles as a disaster recovery process if something does go wrong.
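That safety net can be sketched as a simple try/fall-back job: attempt the incremental log restores, and if a log fails to apply, re-baseline from a full restore. `restore_logs` and `full_restore` here are placeholders for the real jobs, not actual tooling.

```python
# Sketch of the re-baseline safety net. The two callables are
# hypothetical stand-ins for the real restore jobs.

def nightly_job(restore_logs, full_restore):
    """Run log shipping; on a hiccup, re-baseline with a full restore."""
    try:
        restore_logs()
        return "logs applied"
    except RuntimeError:
        full_restore()                # overnight/weekend re-baseline
        return "re-baselined from full backup"
```

On a normal night only the log restores run; the full restore is the expensive fallback you schedule for off-hours.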
The result is a truly replicated instance in your local environment to use for reporting and analytics, giving you up-to-date data for powerful insights and more responsive reporting.