Version control of database objects?

3 August 2005 at 14:34 CEST | In Other, Software development |

In the coming months I will be introducing Subversion to Eurotransplant for version control of all our software. This is part of a bigger plan to learn from 3GL development. Using Subversion is pretty straightforward for things like PLL libraries and FMB files, but what about database objects?

Stored PL/SQL objects (packages, function, procedures and triggers) are also quite straightforward. All these DDL script have CREATE OR REPLACE commands and will replace any previous version of the PL/SQL object.

But what about tables? You initially create these with a CREATE TABLE and during the rest of its live it might have numerous ALTER TABLE statements to change the table. How to keep these in version control?

I came up with a couple of things to consider:

Track end state of the object or modifications?
I’m struggling with this one. On the one hand I would like to store only the table structure as I want it to be in the database. This could be a XML document as created by the DBMS_METADATA package. I would then need something to analyze this XML file and compare it to a current version in the database. It would then need to make any necessary changes to the table in the database to match the structure in the XML file.
The other possibility would be to have the developer record just any ALTER TABLE statements that are necessary to change the table. This gives the developer more freedom in complex data-conversions. What about adding a new column to a table, filling this column with a concatenation of two other columns, drop the two other columns, change the new column to NOT NULL. This could be a real world conversion, but I do not see me expressing these sort of conversions in XML and then deploy them to a database automatically.
So I guess I’ll go for the ALTER TABLE history in one file.

Support for automatic builds
To improve our quality and productivity I want to be able to automatically release a version from subversion to a database. We constantly refresh our development/test databases with (made anonymous) data from our production database. I would like to be able to reset a development database to the production version and then run all changes that are made in Subversion against this database. This automatic build might even include a number of automated tests.
If I go for a SQL file with lots of ALTER TABLE history (see first point), then how would I have this automatic build detect the current state of the database. If there are 10 ALTER TABLE statements in the SQL file, how do I detect which ones have already been run against this database and which still have to be run.
One solution would be to convert the original SQL file from version control to a (set of) SQL file(s) with some extra commands. They could log each executed ALTER TABLE in a sort of dictionary-view. The script could then detect which statement have already been run and can be skipped.
One other way would be to store snapshots of the table state in the original SQL file. I could again use the XML documents as created by the DBMS_METADATA package. After each (series of) ALTER TABLE statement(s) a XML document is included with the state of the table at that moment. When deploying the SQL file to a database I could first get the current state of the table, search for it in the SQL file and start from there.

Support for branching and merging
We do not deliver our software to external customers. We have only one production database running and a couple of development/test environments. This makes live a bit easier. Let’s look at a scenario. We’re approaching the next release. We create a branch of the mainline in Subversion to prepare for release 2.5. Developers can then continue development on the mainline to prepare for version 2.6. Version 2.5 is released to production from its branch. Then a bug report comes in that requires an immediate fix. This fix is developed on the 2.5 branch is labeled as version 2.5.1.
We now have a case where developers have already made changes to a table for version 2.6 and another developer has made changes to the same table on the 2.5(.1) branch. They 2.5.1 developer must be able to merge his changes back in to the mainline.

Special care is needed for merging these changes back to the mainline and still have the automatic build script be able to deploy it to a database. It has to be able to deploy the SQL file to a database which already has the 2.5.1 patch installed as this will be the case on the final deployment to production. It would also be very nice if the automatic build is able to deploy to a database that doesn’t have the 2.5.1 patch yet, but does have some of the 2.6 changes. This can be the case on development/test environments that you do not want to refresh completely as there might have been entered a valuable test-case to that database.

I’m slowly getting there, but I guess it will take some more thinking before I have my final solution. I’ve also been googling for this, but couldn’t really find information how others have solved these issues. Tomorrow I will be attending a mini-seminar at Amis about software development for both Oracle and Java, so let’s hope I can find some answers there. There must be readers of the OraBlogs community that have experienced these same issues? If so, please leave a comment to explain how you solved it.

Update 5-aug-2005: If you’re interested in this subject, you have to checkout the blog of Robert Baillie. Especially his two latest postings on the subject (here and here) are very very very interesting.

Update 5-aug-2005: Googling for the subject resulted in some more perhaps interesting links: here, here and here.

34 Comments

TrackBack URI

  1. Hi , I’m straggling with those issues.

    Here is my ideas so far:
    1. I’m currently builing a utility to extract the db to files (very much like exp -but each object will have different file).
    2. I will use ant task to find extract from source safe (vss) and extract from database and compare.
    3. I will use ant sql task as the main installer utility.

    Comment by amihay gonen — 3 August 2005 #

  2. Hi Amihay,

    I think I’m also going to build everything around Ant for build automation. You say you’re going to extract the DB to files and then compare these to the version in sourcecontrol. I can understand this for PL/SQL and other things you can just CREATE OR REPLACE (e.g. views). But how about tables? Let’s say your object in version control has a couple of extra columns then the table in the database. How are you going to correct that? Are you generating all necessary DDL to alter the database to resemble the version in sourcecontrol? This sound like a hell of a job.

    Once I’m back in the office (two weeks from now) I’ll start working on the subject and I’m planning on writing some sort of whitepaper and publish it here on my blog.

    Comment by Wilfred — 3 August 2005 #

  3. We came up with what I think is a great solution to this problem.
    This is the basic solution…

    Whenever a developer / pair of developers need to change the database they write a patch. This patch contains any alter tables / data migrations / etc. A single file contains the full set of changes for a single conceptual change and may cover changes to several tables.

    These patches are then installed by a ‘patch runner’. The runner logs when a patch starts in a table, and marks the completion or failure of that patch when it finishes.

    By looking in the patch log before it runs a patch , the patch runner ensures that it never attempts to run a patch that has previously been marked as complete.

    In addition, if any patch fails then the patch runner aborts, stating that it failed. “Crash don’t trash!”.

    There is then a list of patches to run for the application that developers just add to the end of. This, and all the patch files are, of course, in version control.

    So, you check out a given tagged version of the application to run against an arbitrary database and run the patch runner. It loads the list of patches that are applicable for that tagged version. It runs over each patch in turn and checks if it has previously ran. If it has not, then it runs the patch.
    By the time it reaches the end of the list it has ran all the patches, none of them twice. You can even run the patch runner twice in succession and guarantee that the second run will not change the state of the database.

    The key to that is then to make sure that you use the build often. Get your developers their own database workspaces and make sure that that rebuild and upgrade them often.

    As an aside, we also write tests for the patches, checking the pre and post conditions are valid… but that’s a very different topic!

    Comment by Rob Baillie — 3 August 2005 #

  4. Code controlling the DB is a tall order at best. I’m running an 11i apps shop and we use Applimation for code control. It has built in promotion and generation routines for all the normal apps file types and can be customized for all different file extensions. I think your use of Subversion and Ant would be a great solution though. Good luck with it!!

    Mark

    Comment by Mark Coleman — 3 August 2005 #

  5. I agree with Rob, the best way to do this is by having the developers write patch scripts. You are basically patching an existing product, your production database. This also means that your nightly builds are easy, just reapply the patch script on top of your copy of the production database.

    The only drawback of this method is that you don’t have one script which shows you the complete create statement of a table, but if a developer really needs this, he can always use toad to get this info.

    Comment by Andrej — 4 August 2005 #

  6. Hi,

    the solution for this problem within our development environment is simple yet effective. We use CVS for versioning.

    There is a single file containing all statements regarding the creation of a table (e.g. including PK, indexes etc.).

    Initially there is only a create table statement. At the moment there is an alter table statement, the initial table statement is changed from source code to documentation (whitin the sourcefile) using standard SQL features. The alter table statement then becomes the code. This way a chain of alter table statements is created whithin the sourcefile. Though the initial table statement or earlier alter table statement seem only to be there for documentation, when isolated it is run-able in the database.

    Using the CVS option “compare” differences between versions of the source can be extracted, delivering either the initial statement either an alter table statement.

    The biggest advantage we experience is that there is only ONE sourcefile containing all information on the table including the table’s history.

    Jules.

    Comment by jules de ruijter — 4 August 2005 #

  7. Jules,

    I’ve seen your solution before (see here). I love the concept of a having the entire history of a table in a single file. This is something that Rob doesn’t have. He has a SQL-script per patch and it’s a bit more difficult to get the complete history of a table.

    However, in your solution how can I support the branching and merging as describerd in the blog post. What if a developer has created a patch on the 2.5(.1) branch, while another developer has continued the development on the mainline for release 2.6. I have to make very sure that the SQL files of both developers are both run on every database and in the correct sequence. I was hoping for some sort of mechanishm that would just compare the database with the SQL-files and apply whatever changes necessary. This is much easier with Rob’s concept, but then I loose the history :-(

    What do you use for this branging and merging, or is this not an issue at Centraal Boekhuis? I would like to support frequent (at least daily) automatic builds on numerous environments.

    Comment by Wilfred — 4 August 2005 #

  8. My brain is now working overtime trying to work out how to give you the table history, whilst keeping the flexibility of our patch runner without compromising the effort free upgrade process. I have to admit that I’m finding it difficult!

    However, as I imply in the comments in response to the post on my own blog, I really don’t see it as that much of a loss. Still, you do, and so it’s a valid point, and so one that I feel as though I’m going to have to address eventually.

    One of the huge advantages we find of our patch runner, when embedded into the bigger picture of a database build script. It is almost trivially easy for us to set up new environments, and we don’t need to get a DBA or release manager involved to do it.

    As long as the database instance already exists, it’s simply a case of running the initial table builds and then letting the patch runner upgrade the objects. Once that finishes, the database views and code are installed.

    The only difference between an upgrade and a build from scratch is that the initial table builds are ran on a build from scratch. And thinking about it… there’s no real reason why that can’t be done in a patch as well… that way there would be no difference.

    And that’s the biggest advantage. We never have to compare files in CVS in order to find which ones we need to run; we never have to tailor a build script for a particular database; we never need a build manager to ‘prepare’ a release.

    We check out the tagged version from CVS and let the patch runner deal with it.

    As standard practice now, whenever a pair of developers update the code workspace they run the database build to make sure thier database is up to date as well. This can mean that the patch runner is ran many times in a single day, by many different people, on many different databases at many different versions.

    We don’t yet perform continuous integration in our environment, but the database build is most definitely not the thing that stops us!

    Now, if only I could find a way of auto-magically generating your table history…

    Comment by Rob Baillie — 4 August 2005 #

  9. Hi Rob,

    I’ve been thinking about this today also. One thing came to mind:

    Have a SQL script per table and add ALTER statements to that. Enclose the ALTER statements with some sort of comment tags specifying the patch-number. That way the patch runner should still be able to go through all files and extract the necessary statements for each patch. If you still want to see the SQL scripts for each patch, then I could first use some sort of (Perl?) script to transform the table SQL scripts to patch SQL scripts. But how do I enforce the sequence of the different statements in a patch. Probably not the best way to go.

    Another way would be to stick to your way with a SQL script per patch and just store all statements in the database. Having them in the database makes it a bit easier to query the history of a table. Downside to that is that the history is in the database and not in version control.

    A simular way would be to have system triggers that log all DDL statement against partcular objects. The PatchRunner could even set the number of the patch in some sort of session/package variable before executing the patch. The system triggers could pick up this patchnumer and record that along with the table history.

    Just some thoughts… I’ll probably keep thinking about this for a couple of days and have to let the ideas mature.

    Comment by Wilfred — 4 August 2005 #

  10. Wilfred,

    On first thoughts, I really like option 3. This could all be dealt with by the patch runner itself, leaving the patches themselves unaffected.

    Not increasing the work involved in producing a patch seems like a good idea to me!

    We have an ID on our patch log, so we can easily see what order patches were applied. Each patch is also named, in our case the name being driven by the story for which it is created. It could easily be a bug number, a change request number, etc. The combination of the two means that it is easy to see the order the patches have been ran on a database, and tie them back to the originating file.

    The patch runner can store the name of the patch currently running in a package variable and then a DDL trigger can log the changes that are made due to that trigger in a patch_log_detail table.
    As you say, you can then query this back at your leisure to produce a table centric view of the patches that have applied.

    A concern that you voice is that those details are held in a table, rather than in version control, but you also state that you are running nightly builds.

    If you really wanted the data in version control, you could have the patch runner produce a report in the format you want. You can then get your nightly build to copy this file into its workspace, and commit it before the build tags the version up. Et voila, your table centric list of changes sitting in version control.

    The theory seems sound to me…

    I’m going to have to let this sink in a little now, but I can feel another blog entry of my own coming on!

    Cheers Wilfred… great post, great conversation!

    Comment by Rob Baillie — 5 August 2005 #

  11. This topic has been a great interest of mine for years. I’ve been part of a CMM Level 3 group for almost 10 years and oddly enough we release DB code almost identically between the most rigid projects and the most agile. We maintain a baseline of all db objects and a version controlled collection of release specific change scripts. A couple of cases: a new table would include a *.ddl baseline script called from a release specific script. A new column would be added via a release specific script, with the ddl script modified, but not released. A little religion here, but if you are doing any of this without Toad you are wasting time generating stuff that has already been automated. I have been on their beta program since 7.6 and they are without question the Oracle industry leader in development tools. I think the Toad development team would jump at the idea of an integrated build tool.

    I still think we are far from a product that you could entrust to any 12 year old on a Sunday night to install into production. How do you automate data scrubs that error due to data introduced on the day of release? Without a workflow process how do you continue to apply the release? Do you stop, rollback/recover/call to correct before automatically applying stuff that only leaves one recovery solution? How much effort do you dedicate to total automation when you still need someone smart enough to answer the previous question?

    Comment by Chuck — 13 August 2005 #

  12. Wilfred…will this thread ever die ;-)

    Chuck,

    Sorry, I just can’t resist this sort of conversation… it turns out that building release structures is turning out to be a bit of passion. Damn!

    Before I start, I really need to say that I don’t mean any of this aggressively, or in order to cause offence… I am genuinely interested in your process and point of view. Forgive me if my questions are way off the mark… I’m just a little confused!

    You say that you:
    “maintain a baseline of all db objects and a version controlled collection of release specific change scripts. A new column would be added via a release specific script, with the ddl script modified, but not released”

    By this, am I correct in assuming that if you need to make a table change in a system that has an active trunk line of development and a branch covering the production version, you may be required to make the change in three places:
    1)The trunk release script.
    2)The live branch release script.
    3)The baseline DDL.

    Also, that the baseline DDL would never be released?

    Is that right?

    If so… why?

    In particular:
    If the baseline DDL script is never released, why is it maintained? What purpose does it serve?
    If there is a script per release, what is the thinking behind this? Is there a move to consolidate these?

    I absolutely agree that Toad is a great development tool and every Oracle developer should get a copy and at least try it out for a month. They let you do this for free you know ;-)
    However, I have never used it as a tool for generating DDL statements that I would ever drop straight into a release structure. Again, I think I’m missing some salient point, or some killer function on the tool. Can you enlighten me as to how you actually use Toad when producing release scripts?

    In answer to your question:
    “How much effort do you dedicate to total automation…”

    OK, I’ll admit it, in our case it was very little.

    But then that’s because we’ve found automation to be simple to implement. Having a solid patch runner embedded within a config file driven build script we’ve found that automation has just happened. That is, our aim was to produce a solid build script that would be difficult to run with the wrong scripts / settings / parameters, etc. In doing so we’ve produced a build script that is easily automated. The two things come together.
    But how much use is it if you still get the errors on the day on the release? Well, the thing is, by making it easy to run an upgrade we’ve made it easy to run upgrades often. Our next step, on the drawing board at the moment, is it to put in place an automated upgrade of a backup from one of our production databases each night. This will include a suite of tests that check the applications running on that database still work. The idea is this: we may not be able to stop the “data scrubs that error due to data introduced on the day of the release”, but we will catch the ones that error due to data that was introduced 5 weeks before, but only spotted on the day of release.

    Finally, I am very interested in what you have to say about workflow processes…

    I’ve always been of the opinion that if you get a failure in an upgrade (or any piece of code) then the best thing to do is stop doing whatever you’re doing and deal with it. The idea is embodied in the phrase “Crash, don’t trash”.
    That is, as soon as an error occurs: a movement away from the valid state; a flaw in an assumption, everything that occurs after that point cannot be guaranteed to be valid: the new instruction starts from an invalid state; we are aware that at least one of our assumptions is flawed.
    From your post I am inferring that your team takes a different standpoint. If so, in what way do you differ?

    Comment by Rob Baillie — 16 August 2005 #

  13. Rob,

    One thing I have learned is that posting + passion + vino leads to some creative discussions. First of all I must confess that I did not read all of Wilfred’s links prior to posting so I am looking forward to digging deeper into Bobablog and some of the other references.

    We dedicate over 3 very senior FTE’s to 4 very different development projects (total ~60 project personnel) just for database migrations and release packaging. Several of the database processes were established 5+ years ago and adapted as newer projects were added to the organization. One my personal digs with the CMM process is the contentment with repeatable processes without significant regard to the effectiveness of the process (at least at the level we jumped off on.)

    To address your specific questions:
    1. We very rarely have to branch off of current development. Emergent releases could force a branch on the baseline DDL script, but the released script (maintained in a release specific version control folder) would be a new script. We have only had to do this a couple of times that I can recall.
    2. The DDL’s are only released on their initial release. We make mods strictly for historical reconstruction in PVCS. On one project the db is shared between 2 applications that are developed by 2 separate contractors. The old contractor was just replaced a couple of months ago by a much cheaper group. Years of experience walked out of the door. The current process forces users to document changes in the version control tool with the specific object that changed, not just the releasable script. I’ll refrain from disclosing our client, but needless to say, some of our projects require some strict accountability for changes. This doesn’t necessarily make anything easier or better, but it does allow you to find the person who implemented the change with the supporting change request.

    Sample: rel 1.0 contains script 1000_1.sql which creates emp.ddl.
    rel 2.0 contains script 1001_1.sql which alters emp to add a new column. Emp.ddl is updated but not packaged with the release.
    3. Script consolidation – currently developers create scripts as needed (e.g. 1000_1.sql, 1000_2.sql). Temporary/non-releasable scripts used to make changes to previously migrated releasable scripts (1000_1tmp1.sql fixes 1000_1.sql). Often the base script cannot be re-executed so it is fixed and not executed in the current release path. The “tmp” script is responsible for making the changes that the original script would have made in its corrected state. Toward the end of the release our dedicated release staff consolidates the developer’s scripts into a run order/installation instructions published in the software release notes.
    4. I am currently celebrating a client purchase of Toad. After 3 failed white papers on why Toad was better than notepad/VI they finally listened! For DDL generation I configured the Schema Browser|Scripts to display the ddl, indexes, constraints, table/column comments, tablespace only, and latest version to remove all of the NO (parallel, logging, etc.) options. Copy/Paste to script Editor, format, apply our standard script header, done. A lot of this is Oracle/Toad version specific, but the later the better WRT storage option omissions.
    5. Not sure if I answered all of the question but we do issue at least 1 script per db object with other scripts that package/sort these specific scripts. Toad helps produce the individual scripts and the new Project Manager has some potential for release organization. If they included the ability to organize/number the items into a run order under the current PM things could be real interesting. Maybe a topic for ToadBeta.
    6. WRT nightly refreshes + release installs, we did use this method a couple of releases ago. It was a somewhat difficult and took some smart people to figure it out even with an existing process that gave us a day old copy of production. The problem with our current release is that our testers spent weeks setting up data and all are at different stages of testing their specific area. A clean rebuild is not possible in these cases.
    7. On a side note: I have used database flashback on 10.1 on a local copy of the database to test pieces of this release. Cool, neat, wow, much potential!
    8. Regarding workflow comments, I think we are on the same page. My concern was an automated tool’s ability to gracefully stop and allow a person to identify and correct the error and restart the process.

    We definitely have some CM processes that drive me crazy as a developer. But, the process is defined to the point that we can change CM coordinators (software + db changes) on the fly with little disruption. As a DB developer/manager I would kill for a tool that allows you to actively manage items in a release and generate build scripts on the fly.

    Comment by Chuck — 17 August 2005 #

  14. Chuck, cheers for the reply, it’s always good to hear how other people are solving the same problems. I’m finally back from holiday, and so back to my commenting self.

    I take your point that you rarely need to branch. There’s no point putting together a solution to a problem that doesn’t exist. We couldn’t really manage without branches, so we needed a solution that did solve that problem.

    Getting on to the way you work, I am still a little confused, though get a feeling that I’m going to remain that way.

    I can see that you feel your process needs to ensure that changes are documented both with the releasable script, and also against the object that has changed. My problem is that I am always worried by anything that breaks the principle “Don’t repeat yourself” (DRY). That is, since your process asks your developers to make the change both in a releasable script and then against the original object, you are effectively asking the developer to do the same job twice. The lack of a table centric view is exactly the stumbling block Wilfred had with my own approach, to which I believe we found a suitable solution. Those notes can be found earlier in this thread, but I’d like to make the original point again. There’s no need to make the change in two places if we can generate the second piece of documentation from the first, and so we choose to make the change by writing the releasable script and make the release mechanism build the other forms we needed.

    As you point out, having this information available allows
    “you to find the person who implemented the change with the supporting change request”.
    Of course, this is a property of the information held, not of the means of creating that information.

    If the releasable script is put into source control along with the rest of the database code, then you still have this accountability. It is still clear who made the change. The release script can then produce the object centric view of the changes as the changes are made. This change history can be put into a table structure, to allow for simple retrieval of that information in many different forms (Which objects have changed in the last 5 days? What object changes did Rob make during his 3-month contract?). It can also generate the table create scripts that you currently have developers creating, if you really feel you need them. You can produce this with a script automatically, rather than use Toad and strip the bits you don’t need. Generating a table create script is actually fairly simple.

    I am even more confused by the approach to fixing a bug in a previously released script. In this situation you appear to change three pieces of code: the original object script, the original patch script and the new patch script. Only the last of these is ever packaged in a release. If this statement is correct, I’m not sure I’ll ever quite get why this is done.

    You then go on to say:
    “Toward the end of the release our dedicated release staff consolidates the developer’s scripts into a run order/installation instructions published in the software release notes.”

    This is one of the tasks our build process aims to remove. By producing the change scripts in a standard form, making the developers responsible for sequencing their individual scripts within the larger build process, keeping the whole of the runnable build together in version control and then running the build regularly (several times a day), we hope to make the build script runnable without any release management tasks taking place. It works in our environment. We don’t have a release manager.

    As for testing, you’re right, you can’t simply remove a system tester’s database and completely rebuild it from scratch overnight. In fact, you probably can’t upgrade it at all without giving the testers warning and letting them decide exactly when that upgrade takes place.

    So where does that put our idea of continually destroying and rebuilding a test database? Well, we simply create a new one. That is, just because we’re looking to test the application doesn’t mean that we need to use the same environment as the rest of the testers. We have the means to create our own environment and build into that. This does then throw up the question, who tests this database, if not the testers? Well, then I think you need to start using automated tests, and that’s a separate topic entirely. It’s also one that I think we have a fairly mature solution to, and one that I’ll be covering on my own blog once I get round to it.

    And finally, you state:
    “As a DB developer / manager I would kill for a tool that allows you to actively manage items in a release and generate build scripts on the fly”.

    Honestly? Because I think that’s what we’ve got, and it only took us abut 4 developers thinking for a day and then coding for a day to get our starting point. We’ve made tweaks over the last 18 months, but nothing too fundamental. We simply started with thinking about all the things in the methods we’d seen that annoyed us, and then set about putting together a process that didn’t have any of those problems. Of course that just meant that we had new problems instead, but then our tweaking is basically removing any new annoyances as they appear.

    Comment by Rob Baillie — 2 September 2005 #

  15. I guess I’m going with Rob here. We’ve had some more internal discussion and I hope to get working on our version and release management this month.

    I to feel that I should stick to DRY (Don’t Repeat Youself). No keeping multiple files in synch by hand. Experience has learned that this will never work. It’s quite simular to our current situation where we’re using Oracle Designer. The Designer Repository tends to get out of synch with the actual database every once in a while. This shouldn’t happen as each developer should make his changes in the Designer repository and then create the DDL scripts from there. You can write this in large letters on the wall, but Developers will just break the rule every now and then.

    I’ll have to go with the concept of a “patch DDL script” in which a Developer writes his/her own sequence of DDL statements for an individual change.

    I’m just not sure what Rob is doing with PL/SQL objects (procedures, functions, packages and triggers). You could also put them in a patch-file or you could re-create these objects whenever they’ve changed. If you use the latter option, you will change the DDL script that creates the PL/SQL object all the time. This gets you full version history of the PL/SQL code. I would like to treat PL/SQL code more like other 3GL source code (Java, C): update the same sourcefile all the time and compile/create the objects from there.

    If you put the PL/SQL objects in the “patch scripts” you will loose version history of PL/SQL objects.

    Another conclusion I made is that I need an automated patchrunner that detects if an individual patch has already been run against a database. I also want to include CRC checking in this. Whenever a patch is run against a database it is registered in a table within that database. I’ll also store a CRC value in this table. That way when I do an incremental build/install and I detect that a patch has already been run I can also check if the patch-file has been altered since it has been run. This is to detect a developer changing a patch-file after checking in a previous version of the same patch-file. If I then do an incremental build on a database I want to detect that the file has changed and abandon the build/install.

    Comment by Wilfred — 3 September 2005 #

  16. Hmmmm, CRC checking to ensure the integrity of past patches. I like it!

    Comment by Rob Baillie — 5 September 2005 #

  17. I have not come as far in the thoughts as you, put have thought of a simple way to handle table schema and data versioning, which is enough for my project’s needs. Feel free to comment on it: http://motinblog.blogspot.com/2005/10/version-control-of-database-data.html

    Comment by Motin — 12 October 2005 #

  18. Motin,

    I cannot comment on your blog without a blogger account which I don’t have.

    Your solution sounds great if you’re building a system that need to install on a clean environment and where you want to create a pre-seeded database. But what about upgrading an existing application to the latest version. You cannot drop and re-create tables there since you want to keep the data.

    Comment by Wilfred — 12 October 2005 #

  19. 1) Are there any standards or best practices that can be followed while designing the directory structure for the versioning of db objects?

    2) How can i extract the sql packages, procedures from the database into seperate files to store these first time into the version control pvcs? Is there any way to automate this rather than manually copying all the procs and packages from db to pvcs?
    Thanks
    Kaustubh

    Comment by Kaustubh — 30 May 2006 #

  20. Hi Kaustubh,

    I personally don’t know of any standards for directory structures for database objects. I know several books about version control systems propose directory structures, but they are more aimed at 3GL (e.g. C, Java) development.

    My personal opinion is to not overdo it with (sub)directories. I use the file extension to distinguish file types (.prc for procedures, .pkb for package bodies, etc). I tend to place them all in the same directory as I hate browsing though dozens of sub directories to get to all my files. I can group files per type by using the file extension. The only thing I did was create a directory per database-schema. This also allows my patch runner to determine the schema to use for running the scripts.

    As for extracting source from the database I recommend using the DBMS_METADATA package. Just look it up in the documentation. It can create CREATE statements for any database object and you’re guaranteed the CREATE statement can handle whatever (new) feature of the database object. This is very hard to accomplish if you try to extract your own CREATE statement by selecting the data dictionary views.

    Comment by Wilfred — 1 June 2006 #

  21. I was wondering if robust solution to version control of database objects in Oracle was developed? I have several Oracle developers working on database components for Java and .NET applications and would like to get these database objects in version control.

    Thanks,

    Brad

    Comment by Brad Smith — 21 February 2007 #

  22. Brad,

    Oracle does not have any solutions for this yet. With source control you’re best of with putting all database objects in text/SQL files and have those files under verison control.

    Comment by wilfred — 22 February 2007 #

  23. Hi Rob Bailie,

    I have been reading the BLOB here and would like to clarify a few things. My current role is to do product evaluation on version control tool. During this process, I discovered this web site and find it very interesting and useful. The organisation I work for is a big fan of Oracle database and this is the time for me to decide on how to check those objects and track them through the version control repository.

    My questions are:
    =================
    1. Do I have to create a big SQL script as a baseline DDL script to create all the database objects such as tables, triggers, indexes, etc? Please remember that these databases are already in production and concurrent developments are still going.
    2. Then, do I need to consolidate all the changes of “Altr script” into one SQL script to become the patch script?
    3. When you talk about the patch runner, do you have example and how do you implement that?

    Many thanks.

    Juny

    Comment by Juny Laksmana — 23 August 2007 #

  24. Juny,

    You better ask Rob himself at http://robertbaillie.blogspot.com/. He’s the one that already implemented all this stuff. We’re just thinking about it and having trouble to find the time to actually implement the stuff.

    My two cents:
    1. The primary reason for having baseline DDL scripts is to be able to create a database from scratch. This might be a requirement if you’re an ISV delivering software to your clients. For our own internal systems, this is not necessary. We only “upgrade” our database, never do a fresh install. Although it might be required for creating an empty test systems (but we don’t, we use and obfuscated copy of production)
    2. It’s up to you how many statements and which you combine in a single patch script. I guess that’s the power of the concept. It’s a developer who knows best which statements are required for a certain change. Sometimes there are two changes to a table, you might want to have in different scripts (e.g. they’re from unrelated issues). In other cases, you might one to combine alters on different tables since they have a relation (e.g. changing of a data structure between a master-detail with conversion of existing data).
    3. That’s really something for Rob. In our case it would be implemented as an Ant script that would run all the patch scripts in the indicated order. But it could just as well be implemented as a shell script, Perl, perhaps even a SQL*Plus script, or whatever language you feel comfortable with.

    Comment by wilfred — 23 August 2007 #

  25. Hi,

    we ship standard software to multiple sites and the software uses a repository with metadata (Invantive Producer) to transparently create the datamodel. If a customer skipped some releases, these changes are automatically merged and upgraded. However, our current problem is that we want the repository to be version controlled itself on a fine-grained level.

    For example:
    * In build 23 a table BUBS_GEBRUIKERS had 5 columns. This table is represented by a row in the repository and the columns by five rows in another table in the repository.
    * Then, in build 24 someone adds an extra column and changes an existing column. This is an extra row and a change to the column.
    * Build 23 was in meantime deployed to a customer. This customer requires another column to be added on top of build 23, leading to build 23.1.

    We have timetravelling functionality in the repository, so we can see what was changed when.

    How would you handle version management of such a repository, not using a coarse-grained solution as maintaining the whole repository or one application within it as a version?

    This would allow our developers to export/import an application onto the database on their laptop, make the changes onsite and merge them later into the maintenance build in the office and into the HEAD branch.

    Any pointers to PhD theses or so?

    Comment by Guido Leenders — 16 November 2007 #

  26. Guide; it looks like you thought things through thoroughly. What part of version management are you having trouble with?

    For me, the only difficult part is maintaining and deploying the necessary upgrade scripts to get from a specific version to another specific version. This is something different than storing the structure of the two particular versions. Some database structures (like tables) cannot be simply replaced with their new version since they contain data. You cannot simply drop and replace these objects as you can with most version controlled objects (database views, stored PL/SQL, compiled programs, etc).

    I think I understand your situation with a repository storing version X of your application definition. But what exactly is it that you have troubles with?

    Comment by wilfred — 18 November 2007 #

  27. Migrating the datamodel is not an issue,our infrastructure does that automatically and transparantly. However, we want to put the datamodel itself under version control.
    Each column (as stored in our own repository) for example should have a version number and a configuration of the system consists of several thousands of these elements.
    The same holds for attribute rules (say check constraints etc), tables, views, etc.

    All these items are stored in a repository similar to Oracle Designer. We want a consultant to be able to take the repository on the road, make changes off-site to the application structure in the repository and when back at the office, he merges his changes into our head branch or a maintenance branch.

    Comment by Guido Leenders — 19 November 2007 #

  28. Guido: I find it hard to believe you have a solution that can upgrade a datamodel completely automated. There are a huge number of different data migration scenarios you will have to tackle.

    What about removing a column from a table and creating a new child-table with that column. You will have to retain the data, so you would first need to create the new child table, copy the data from the old column to the new table, then drop the old column. To make things more complex, you could have database triggers with business constraints that need to be modified to this new structure as well. These will have to be disabled during this data migration.

    I can go on and on… We decided it’s impossible to handle all these migrations automatically by just storing the structure of the datamodel and generating migration scripts. You need developers to do this, and for most of them it’s relatively easy to write these migration scripts. You then need a framework to run the correct migration scripts depending on the current and desired version.

    But, I still don’t see your problem; you seem yo give the solution yourself; a consultant makes changes off-site, you keep track of these changes and merge them back to the central repository when back at the office. This sounds like a typical concurrent versioning solution. You can have a look at CVS and Subversion to see how they implemented this. This is very typical for large-group developments, like open source projects. Many developers get a local working copy and make local changes. When they’re ready to commit their work to the central repository, they first do an update to get all the changes in the central repository since their last update to their local working copy. This update can cause merge conflicts if the same pieces were updated in the local working copy and in the central repository. After resolving these merge conflicts, the developer can check in the local working copy to the central repository.

    The key is, you have a good mechanism to detect the changes in a local working copy. For most concurrent versioning tools, like CVS and Subversion this is text-file based and it uses standard algorithms to diff two text-files. Once this delta is known, you can apply the delta from the central repository to a local working copy (the update action) and you can also apply the delta from the local working copy to the central repository (the commit).

    Things get more complicated if you want to support tagging and branching, but it’s doable. I suggest you take a look at Subversion at http://subversion.tigris.org/ and the associated book at http://svnbook.red-bean.com/

    You should be able to borrow their ideas for you own solution.

    Comment by wilfred — 23 November 2007 #

  29. Regarding our current situation: we solved it by maintaining a conversion script in our metadata for every change of column (the only really persistent part of the datadictionary). This one is executed if necessary. The rest of the objects (code etc.) are always overwritten by the new version. Is similar to the Oracle Apps approach, with little extensions.

    Regarding version management: cvs is already in place, but our problem is not to manage text files but to manage relationships between text files. This is somewhat similar to the problem CVS is having with file renames: the old and new file have a relation. For the Linux kernel development and subversion these systems support this.
    But our problem is little bit larger, the relationships have more meanings. In the past, I’ve used Echo/2000 for Designer, worked fine once bugfree, but a … in … to keeps things synchronized. Now we have to develop something similar in functionality, but there are little to no theoretical articles on how to handle this.

    Comment by Guido Leenders — 23 November 2007 #

  30. Hi,

    I see you are an acquaintance of Andrej. Plz say hi to him if you see him again! I was at Eurotransplant a few times in ‘95 or ‘96 when I worked for Redwood. Andrej and I worked on a project in Lux.

    Comment by Guido Leenders — 23 November 2007 #

  31. Guido,

    I don’t know of any official academic papers on the subject, sorry. I might see Andrej tomorrow at AMIS. If so, I’ll let him know.

    Comment by wilfred — 27 November 2007 #

  32. Hi,

    We actually produce a software product that solves just this problem. It allows you to create testable, reusable scripts that force your database to grow along a single upgrade path.

    Using this tool, we are able to create a single script that is used in all tests that touch the database (acceptance, database unit, tests for transitions from one version to the next) as well as on all of our databases (dev, test, production).

    Check it out at the following link:

    Hexagon Software DataConstructor

    If you have any questions feel free to contact me directly at max [at] hexsw [dot] com.

    Thanks,
    Max Guernsey, III

    Comment by Max Guernsey, III — 15 January 2008 #

  33. Special care is needed for merging these changes back to the mainline and still have the automatic build script be able to deploy travesti it to a database. It has to be able to deploy the SQL file to a database which already has the 2.5.1 patch installed as this will be the case on the final deployment to production.

    Comment by cafer — 26 August 2009 #

  34. Hey Guys,

    Check out TeamWork from dbMaestro (www.dbmaestro.com). This is a source-control product for database objects. it handles many of the problems and pain-points mentioned in this thread, including (but not limited to) version-history on objects (and data), diffs, deployment issues and impact-analysis.

    Comment by Tal — 26 October 2009 #

Trackback/Pingbacks:

  1. Bobablog

    The Database Patch Runner After posting my comment on here, Amihay Gonen asked a few questions, basically trying to get a bit more detail. So I've put a more thorough explanation of the patch runner we use onto my own blog. I hope it's of use...

  2. Bobablog

    The Database Patch Runner - Rollbacks In response to the post listed above, Andrew Beacock asked a seemingly innocuous question and I thought people here might be interested in the response... How do you rollback a failed patch?

  3. OraTransplant » Database agility

    [...] I wrote a blog entry about version controlling database objects last week. This started some interesting discussions with other Oracle professionals and made Robert Baillie  write down some interesting entries at his own blog. [...]

  4. Bobablog

    Database Upgrade Scripts – Why all the effort? Chuck's question: "How much effort do you dedicate to total automation when you still need someone smart enough to [deal with failures]" led me to think about that for a bit. Why do we put this effort in? What do we hope to gain? Hopefully, from goi...

  5. Bobablog

    The Database Patch Runner: Dealing with code Previously I've talked about the database patch runner as a way of easily and consistently getting database changes into version control, onto databases and under automated frameworks. But I glossed over dealing with source code...

  6. Bobablog

    Database Patch Runner: Design by Contract So let's say that you've managed to put together a build script that will install the latest version of your database with the minimum of work and you've got your developers using the build to upgrade their own workspaces. How do you take that a ste...

  7. shomphe.com » Blog Archive » Version Control of database objects

    [...] http://www.oratransplant.nl/2005/08/03/version-control-of-database-objects/ [...]

  8. A Few More Words » Agile Database - Part 2: The release problem research

    [...] Version control of database objects - a long and interesting dicussion talking about a big change log solution. An initial set of considerations: “Track the end state of the object or modifications?”, “Support for automatic builds”, and “Support for branching and merging”. One of the comments describes and interesting approach, to comment out the previous DDL in the source file, so it is still there, but only the new migration command will be executed. The concept of being able to generate whatever artifacts you may be interested in is also mentioned (for example, if you want a clean table creation script for a table, or history of changes for one table, go generate it - don’t repeat yourself in the version control system). [...]

  9. IT-eye Weblog » Running sqlplus scripts in Ant

    [...] It’s always important to remember that you’re developing software that needs to be installed by one or more customers. Too often developers will just create objects in a development database using a tool like toad, without thinking about how it will be shipped and installed. By using a continuous integration build process and creating your database from scripts you are forced to think about this during development. More on this subject can be found on Wilfred’s weblog. He had a very interesting post about version control for database objects last year, which also discusses automatic builds for database objects. [...]

  10. Abnormal Form

    Database Build Management... Despite being a dual role production/development DBA, one of my job roles is build manager and source code manager. I recently migrated our source control system from Microsoft SourceSafe to Subversion, a decision that I am largely satisfied with, thou...

  11. Cj6a Mp3 Category, Cj6 Part Manufactured

    [...] 1234567891011121314151617181920212223242526272829303132Cj6a Mp3 Category, Cj6 Part ManufacturedCj6a Mp3 Category, Cj6 Part Manufacturedcj6a price aircraft for sale control surfaces cj6a fuzhou cj6 manual cj6a parts 5b made cj6a sleeve cj6 timeline used jeep cj6a tuxedo park jeepster commando cj6a piano score aircraft cj6a mp3 category tires jeep cj6a cj6a bumper automotive jeep cj5 cj6a chopper cj6a used yak 18 aerobatic aircraft chinese cj6 episode air cj6a military aircraft flight cj6a part aircraft yak 18 landing gear cj6a last minute hot air ballooning cj6a aftermarket automotive parts high quality ignition coil cj6a zone mothers day gifts flying cj6a prices low cj6a new auto cj6a second hand cj 5 aircraft cj6a promotion plane cj6a pt jeeps cj6 car jeep wagoneer cj6 cost alternative cj6 part info cj6a promo cj6 army jeep cj6a london cj6a manual motorcycles cj6a zone g forces cj6a farms cj6a used shanghai radial engine cj6a tv rv jeep cj6 episode rear axle cj6 headlight rectangular jeep comanche cj6a power pilots cj6a aircraft cj6a coupon combat flight yak 52 cj6a part classification civil air patrol cj6a carpet cj6a 2nd hand cj 5 produced part number jeep cj6 cj6a used cj 7 training cj6a coolant custom cj6a pt travis afb cj6 coast guard cj6a sale 1967 jeep cj5 cj6a pt category mcguire afb 1969 cj6 1975 jeep cj6 antique sort cj6a cj6a 2nd hand yak 18a clearance jeep cj6 cj6a bulb structure cj6a 2nd hand chinese yak 18 cj6 coupon fits cj6a evaluation cj6a episodes snj 5 cj6a performance radial engine landing gear cj6a sell military aircraft nanchang cj6a headlight distributor high intensity discharge cj6a headlight automotive wholesale cj6a boot low cj6a cj6a manual scrambler jeep grand cherokee cj6 substitute omix ada best buy cj6a 1959 willys jeep cj6 1969 jeep cj6 auto parts performance j20 parts jeep cj6 cj6a old airplane aerobatics cj6a headlight automotive grand cherokee laredo jeep cj6 cj6 auction jeep cj5 cj6a review cj6a yak 11 11fr cj6a hard cj6a common chinese warbird cj6a yak 18 chinese decided cj6a generator cj6a seats week cj6a sell china imported cj6a accessories year cj6a performance military aircraft yak 52 cj6a tires fit cj6a part helicopter cj6a body kit grand wagoneer daewoo lanos cj6a repair car truck shop cj6a replacement pep boys cj6a tyra banks cj6 discount body parts cj6a apartments barwon heads cj6a kanji cj6a mix cj6 pt chinese aircraft cj6a horse cj6a article cj6a free id cj6 sale australia jeep cj7 1956 F100 Parts | Mountaineer Always Free | 1991 Used Dodge | Sapporo Nyc Menu | 450sel Model Mercedes | [...]

Sorry, the comment form is closed at this time.

Powered by WordPress with Pool theme design by Borja Fernandez.
Entries and comments feeds.