What are the options for storing hierarchical data in a relational database?

Good Overviews

Generally speaking, you are choosing between fast read times (for example, nested set) and fast write times (adjacency list). Usually, you end up with a combination of the options below that best fits your needs. The following provides some in-depth reading:

Options

Ones I am aware of and general features:

  1. Adjacency List:
  • Columns: ID, ParentID
  • Easy to implement.
  • Cheap node moves, inserts, and deletes.
  • Expensive to find the level, ancestry & descendants, path
  • Avoid N+1 via Common Table Expressions in databases that support them
  2. Nested Set (a.k.a. Modified Preorder Tree Traversal)
  • Columns: Left, Right
  • Cheap ancestry, descendants
  • Very expensive O(n) moves, inserts, deletes due to volatile encoding (on average about half of the rows must be renumbered)
  3. Bridge Table (a.k.a. Closure Table w/ triggers)
  • Uses separate join table with ancestor, descendant, depth (optional)
  • Cheap ancestry and descendants
  • Write cost is O(log n) (size of the subtree) for inserts, updates, deletes
  • Normalized encoding: good for RDBMS statistics & query planner in joins
  • Requires multiple rows per node
  4. Lineage Column (a.k.a. Materialized Path, Path Enumeration)
  • Column: lineage (e.g. /parent/child/grandchild/etc…)
  • Cheap descendants via prefix query (e.g. LEFT(lineage, #) = '/enumerated/path')
  • Write cost is O(log n) (size of the subtree) for inserts, updates, deletes
  • Non-relational: relies on Array datatype or serialized string format
  5. Nested Intervals
  • Like nested set, but with real/float/decimal so that the encoding isn’t volatile (inexpensive move/insert/delete)
  • Has real/float/decimal representation/precision issues
  • Matrix encoding variant adds ancestor encoding (materialized path) for “free”, but with the added trickiness of linear algebra.
  6. Flat Table
  • A modified Adjacency List that adds a Level and Rank (e.g. ordering) column to each record.
  • Cheap to iterate/paginate over
  • Expensive move and delete
  • Good Use: threaded discussion – forums / blog comments
  7. Multiple lineage columns
  • Columns: one for each lineage level, referring to all the parents up to the root; levels below the item’s own level are set to NULL
  • Cheap ancestors, descendants, level
  • Cheap insert, delete, move of the leaves
  • Expensive insert, delete, move of the internal nodes
  • Hard limit to how deep the hierarchy can be
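
To make the adjacency-list trade-off from the list above a little more concrete, here is a minimal sketch using Python's built-in sqlite3 module. It fetches a whole subtree with one recursive common table expression instead of issuing one query per level, and shows that a move is a single-row update. The table name `node`, its columns, the sample rows, and the helper names are all assumptions made up for this illustration.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory adjacency-list tree (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO node (id, parent_id, name) VALUES (?, ?, ?)",
        [
            (1, None, "root"),
            (2, 1, "a"),
            (3, 1, "b"),
            (4, 2, "a1"),
            (5, 4, "a1x"),
        ],
    )
    return conn


def descendants(conn, node_id):
    """Return (id, level) for every descendant of node_id in a single query."""
    rows = conn.execute(
        """
        WITH RECURSIVE subtree(id, level) AS (
            SELECT id, 1 FROM node WHERE parent_id = ?
            UNION ALL
            SELECT n.id, s.level + 1
            FROM node AS n
            JOIN subtree AS s ON n.parent_id = s.id
        )
        SELECT id, level FROM subtree ORDER BY level, id
        """,
        (node_id,),
    )
    return rows.fetchall()


def move(conn, node_id, new_parent_id):
    """Re-parent a node; its whole subtree follows without further writes."""
    conn.execute(
        "UPDATE node SET parent_id = ? WHERE id = ?", (new_parent_id, node_id)
    )
```

The read side needs the recursive query, while the write side stays trivial, which is the balance the adjacency-list entry describes.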

Database Specific Notes

MySQL

Oracle

PostgreSQL

SQL Server

  • General summary
  • SQL Server 2008 offers the HierarchyId data type, which appears to help with the Lineage Column approach and to expand the depth that can be represented.


Popular Answers:

  1. This design was not mentioned yet:

    Multiple lineage columns

    Though it has limitations, if you can bear them, it’s very simple and very efficient. Features:

    • Columns: one for each lineage level, refers to all the parents up to the root, levels below the current items’ level are set to 0 (or NULL)
    • There is a fixed limit to how deep the hierarchy can be
    • Cheap ancestors, descendants, level
    • Cheap insert, delete, move of the leaves
    • Expensive insert, delete, move of the internal nodes

    Here follows an example – taxonomic tree of birds so the hierarchy is Class/Order/Family/Genus/Species – species is the lowest level, 1 row = 1 taxon (which corresponds to species in the case of the leaf nodes):

    CREATE TABLE `taxons` (
      `TaxonId` smallint(6) NOT NULL default '0',
      `ClassId` smallint(6) default NULL,
      `OrderId` smallint(6) default NULL,
      `FamilyId` smallint(6) default NULL,
      `GenusId` smallint(6) default NULL,
      `Name` varchar(150) NOT NULL default ''
    );

    and the example of the data:

    +---------+---------+---------+----------+---------+-------------------------------+
    | TaxonId | ClassId | OrderId | FamilyId | GenusId | Name                          |
    +---------+---------+---------+----------+---------+-------------------------------+
    |     254 |       0 |       0 |        0 |       0 | Aves                          |
    |     255 |     254 |       0 |        0 |       0 | Gaviiformes                   |
    |     256 |     254 |     255 |        0 |       0 | Gaviidae                      |
    |     257 |     254 |     255 |      256 |       0 | Gavia                         |
    |     258 |     254 |     255 |      256 |     257 | Gavia stellata                |
    |     259 |     254 |     255 |      256 |     257 | Gavia arctica                 |
    |     260 |     254 |     255 |      256 |     257 | Gavia immer                   |
    |     261 |     254 |     255 |      256 |     257 | Gavia adamsii                 |
    |     262 |     254 |       0 |        0 |       0 | Podicipediformes              |
    |     263 |     254 |     262 |        0 |       0 | Podicipedidae                 |
    |     264 |     254 |     262 |      263 |       0 | Tachybaptus                   |
    +---------+---------+---------+----------+---------+-------------------------------+

    This is great because this way you accomplish all the needed operations in a very easy way, as long as the internal categories don’t change their level in the tree.
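
    To show why the reads are cheap in this design, here is a small sketch using Python's built-in sqlite3 module and a subset of the rows from the table above. Descendants are a plain equality filter on one lineage column, and the level is simply the number of filled lineage columns. The helper names are invented for this example, and SQLite stands in for the MySQL table of the original.

```python
import sqlite3

# A subset of the sample rows shown above; 0 marks an unused lineage level.
ROWS = [
    (254, 0, 0, 0, 0, "Aves"),
    (255, 254, 0, 0, 0, "Gaviiformes"),
    (256, 254, 255, 0, 0, "Gaviidae"),
    (257, 254, 255, 256, 0, "Gavia"),
    (258, 254, 255, 256, 257, "Gavia stellata"),
    (259, 254, 255, 256, 257, "Gavia arctica"),
    (262, 254, 0, 0, 0, "Podicipediformes"),
    (263, 254, 262, 0, 0, "Podicipedidae"),
]


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taxons (TaxonId INTEGER PRIMARY KEY, ClassId INTEGER,"
        " OrderId INTEGER, FamilyId INTEGER, GenusId INTEGER, Name TEXT)"
    )
    conn.executemany("INSERT INTO taxons VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def lineage(conn, taxon_id):
    """Ancestor ids of a taxon, root first, read from its own row."""
    row = conn.execute(
        "SELECT ClassId, OrderId, FamilyId, GenusId FROM taxons WHERE TaxonId = ?",
        (taxon_id,),
    ).fetchone()
    return [i for i in row if i]  # drop 0 / NULL placeholders


def level(conn, taxon_id):
    """Depth below the root: the number of filled lineage columns."""
    return len(lineage(conn, taxon_id))


def ancestors(conn, taxon_id):
    """Ancestor names, root first; at most one lookup per lineage column."""
    return [
        conn.execute("SELECT Name FROM taxons WHERE TaxonId = ?", (i,)).fetchone()[0]
        for i in lineage(conn, taxon_id)
    ]


def descendants_of_order(conn, order_id):
    """Every taxon below an order: a single equality filter, no recursion."""
    rows = conn.execute(
        "SELECT Name FROM taxons WHERE OrderId = ? ORDER BY TaxonId", (order_id,)
    )
    return [r[0] for r in rows]
```

    The price is visible in the schema: the query has to know which column corresponds to the level of the starting node, and the number of columns caps the depth.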

  2. If your database supports arrays, you can also implement a lineage column or materialized path as an array of parent ids.

    Specifically with Postgres you can then use the set operators to query the hierarchy, and get excellent performance with GIN indices. This makes finding parents, children, and depth pretty trivial in a single query. Updates are pretty manageable as well.

    I have a full write up of using arrays for materialized paths if you’re curious.
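
    The array idea can be sketched without a database. The Python below only imitates the queries in memory: each node stores the ids of its ancestors as an ordered list, and the membership test plays the role that an array containment condition (backed by a GIN index) would presumably play in Postgres. The sample data and all function names are invented for illustration.

```python
def sample_paths():
    """node id -> ids of its ancestors, root first (hypothetical data)."""
    return {
        1: [],
        2: [1],
        3: [1],
        4: [1, 2],
        5: [1, 2, 4],
    }


def descendants(paths, node_id):
    """Every node whose ancestor array contains node_id."""
    return sorted(n for n, path in paths.items() if node_id in path)


def children(paths, node_id):
    """Nodes whose last array element (direct parent) is node_id."""
    return sorted(n for n, path in paths.items() if path and path[-1] == node_id)


def parents(paths, node_id):
    """All ancestors, root first: just the stored array."""
    return list(paths[node_id])


def depth(paths, node_id):
    """Depth is the array length."""
    return len(paths[node_id])


def move(paths, node_id, new_parent_id):
    """Re-parent a subtree by swapping the path prefix of each affected node.

    This sketch does not guard against moving a node under its own descendant.
    """
    old_prefix_len = len(paths[node_id])
    new_prefix = paths[new_parent_id] + [new_parent_id]
    for n in [node_id] + descendants(paths, node_id):
        paths[n] = new_prefix + paths[n][old_prefix_len:]
```

    Note that a move touches one row per node in the moved subtree, which matches the write cost listed for the lineage column approach.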

  3. This is really a square peg, round hole question.

    If relational databases and SQL are the only hammer you have or are willing to use, then the answers that have been posted thus far are adequate. However, why not use a tool designed to handle hierarchical data? Graph databases are ideal for complex hierarchical data.

    The inefficiencies of the relational model, along with the complexity of any code/query solution that maps a graph/hierarchical model onto a relational model, are just not worth the effort when compared to the ease with which a graph database can solve the same problem.

    Consider a Bill of Materials as a common hierarchical data structure.

    class Component extends Vertex {
        long assetId;
        long partNumber;
        long material;
        long amount;
    };

    class PartOf extends Edge {
    };

    class AdjacentTo extends Edge {
    };

    Shortest path between two sub-assemblies: Simple graph traversal algorithm. Acceptable paths can be qualified based on criteria.

    Similarity: What is the degree of similarity between two assemblies? Perform a traversal on both sub-trees computing the intersection and union of the two sub-trees. The percent similar is the intersection divided by the union.

    Transitive Closure: Walk the sub-tree and sum up the field(s) of interest, e.g. “How much aluminum is in a sub-assembly?”
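
    As a rough, database-free illustration of the similarity and transitive-closure traversals described above, here is a small Python sketch over an in-memory bill of materials. The component names, the aluminum weights, and the function names are all made up, and each component is counted once per assembly; a real bill of materials would also need per-edge quantities.

```python
# assembly -> direct sub-components (hypothetical "part of" edges)
CHILDREN = {
    "bike_a": ["frame", "wheel"],
    "bike_b": ["frame", "motor"],
    "frame": ["tube"],
    "wheel": ["rim", "spoke"],
}

# grams of aluminum per leaf component (hypothetical values)
ALUMINUM_GRAMS = {"tube": 900, "rim": 400, "spoke": 0, "motor": 250}


def descendants(root):
    """Walk the sub-tree below root and return every component reached."""
    seen = set()
    stack = list(CHILDREN.get(root, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(CHILDREN.get(node, []))
    return seen


def similarity(a, b):
    """Intersection of the two sub-trees divided by their union."""
    sub_a, sub_b = descendants(a), descendants(b)
    union = sub_a | sub_b
    return len(sub_a & sub_b) / len(union) if union else 1.0


def total_aluminum(root):
    """Sum the field of interest over the whole sub-tree."""
    return sum(ALUMINUM_GRAMS.get(node, 0) for node in descendants(root))
```

    With this data the two bikes share the frame and its tube, so two of the six distinct components overlap.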

    Yes, you can solve the problem with SQL and a relational database. However, there are much better approaches if you are willing to use the right tool for the job.

  4. I am using PostgreSQL with closure tables for my hierarchies. I have one universal stored procedure for the whole database:

    CREATE FUNCTION nomen_tree() RETURNS trigger
        LANGUAGE plpgsql
        AS $_$
    DECLARE
      old_parent INTEGER;
      new_parent INTEGER;
      id_nom INTEGER;
      txt_name TEXT;
    BEGIN
    -- TG_ARGV[0] = name of table with entities with PARENT-CHILD relationships (TBL_ORIG)
    -- TG_ARGV[1] = name of helper table with ANCESTOR, CHILD, DEPTH information (TBL_TREE)
    -- TG_ARGV[2] = name of the field in TBL_ORIG which is used for the PARENT-CHILD relationship (FLD_PARENT)
      IF TG_OP = 'INSERT' THEN
        EXECUTE 'INSERT INTO ' || TG_ARGV[1] || ' (child_id,ancestor_id,depth)
          SELECT $1.id,$1.id,0 UNION ALL
          SELECT $1.id,ancestor_id,depth+1 FROM ' || TG_ARGV[1] || ' WHERE child_id=$1.' || TG_ARGV[2] USING NEW;
      ELSE
        -- EXECUTE does not support conditional statements inside
        EXECUTE 'SELECT $1.' || TG_ARGV[2] || ',$2.' || TG_ARGV[2] INTO old_parent,new_parent USING OLD,NEW;
        IF COALESCE(old_parent,0) <> COALESCE(new_parent,0) THEN
          EXECUTE '
          -- prevent cycles in the tree
          UPDATE ' || TG_ARGV[0] || ' SET ' || TG_ARGV[2] || ' = $1.' || TG_ARGV[2]
            || ' WHERE id=$2.' || TG_ARGV[2] || ' AND EXISTS(SELECT 1 FROM '
            || TG_ARGV[1] || ' WHERE child_id=$2.' || TG_ARGV[2] || ' AND ancestor_id=$2.id);
          -- first remove edges between all old parents of node and its descendants
          DELETE FROM ' || TG_ARGV[1] || ' WHERE child_id IN
            (SELECT child_id FROM ' || TG_ARGV[1] || ' WHERE ancestor_id = $1.id)
            AND ancestor_id IN
            (SELECT ancestor_id FROM ' || TG_ARGV[1] || ' WHERE child_id = $1.id AND ancestor_id <> $1.id);
          -- then add edges for all new parents ...
          INSERT INTO ' || TG_ARGV[1] || ' (child_id,ancestor_id,depth)
            SELECT child_id,ancestor_id,d_c+d_a FROM
            (SELECT child_id,depth AS d_c FROM ' || TG_ARGV[1] || ' WHERE ancestor_id=$2.id) AS child
            CROSS JOIN
            (SELECT ancestor_id,depth+1 AS d_a FROM ' || TG_ARGV[1] || ' WHERE child_id=$2.'
            || TG_ARGV[2] || ') AS parent;' USING OLD, NEW;
        END IF;
      END IF;
      RETURN NULL;
    END;
    $_$;

    Then for each table where I have a hierarchy, I create a trigger

    CREATE TRIGGER nomenclature_tree_tr
      AFTER INSERT OR UPDATE ON nomenclature
      FOR EACH ROW
      EXECUTE PROCEDURE nomen_tree('my_db.nomenclature', 'my_db.nom_helper', 'parent_id');

    For populating a closure table from existing hierarchy I use this stored procedure:

    CREATE FUNCTION rebuild_tree(tbl_base text, tbl_closure text, fld_parent text) RETURNS void
        LANGUAGE plpgsql
        AS $$
    BEGIN
      EXECUTE 'TRUNCATE ' || tbl_closure || ';
        INSERT INTO ' || tbl_closure || ' (child_id,ancestor_id,depth)
          WITH RECURSIVE tree AS
          (
            SELECT id AS child_id,id AS ancestor_id,0 AS depth FROM ' || tbl_base || '
            UNION ALL
            SELECT t.id,ancestor_id,depth+1 FROM ' || tbl_base || ' AS t
            JOIN tree ON child_id = ' || fld_parent || '
          )
          SELECT * FROM tree;';
    END;
    $$;

    Closure tables are defined with 3 columns: ANCESTOR_ID, DESCENDANT_ID, DEPTH (the functions above name the descendant column child_id). It is possible (and I even advise it) to store records with the same value for ANCESTOR and DESCENDANT, and a value of zero for DEPTH. This simplifies the queries for retrieving the hierarchy. And they are very simple indeed:

    -- get all descendants
    SELECT tbl_orig.*,depth FROM tbl_closure LEFT JOIN tbl_orig ON descendant_id = tbl_orig.id WHERE ancestor_id = XXX AND depth <> 0;
    -- get only direct descendants
    SELECT tbl_orig.* FROM tbl_closure LEFT JOIN tbl_orig ON descendant_id = tbl_orig.id WHERE ancestor_id = XXX AND depth = 1;
    -- get all ancestors
    SELECT tbl_orig.* FROM tbl_closure LEFT JOIN tbl_orig ON ancestor_id = tbl_orig.id WHERE descendant_id = XXX AND depth <> 0;
    -- find the deepest level of children
    SELECT MAX(depth) FROM tbl_closure WHERE ancestor_id = XXX;
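
    For readers who want to try the closure-table queries without PostgreSQL, here is a minimal sketch using Python's built-in sqlite3 module. It reuses the tbl_orig / tbl_closure names and the ancestor_id, descendant_id, depth columns from the queries above, but maintains the closure rows in application code instead of a trigger. The sample tree and the helper names are assumptions for illustration only.

```python
import sqlite3


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tbl_orig (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE tbl_closure (ancestor_id INTEGER, descendant_id INTEGER,"
        " depth INTEGER, PRIMARY KEY (ancestor_id, descendant_id))"
    )
    return conn


def add_node(conn, node_id, parent_id, name):
    """Insert a node plus its closure rows (what the trigger does on INSERT)."""
    conn.execute("INSERT INTO tbl_orig VALUES (?, ?, ?)", (node_id, parent_id, name))
    # Self-referencing row with depth 0, as the answer recommends.
    conn.execute("INSERT INTO tbl_closure VALUES (?, ?, 0)", (node_id, node_id))
    # One row per ancestor of the parent; the parent's own depth-0 row
    # yields the direct parent link with depth 1.
    conn.execute(
        "INSERT INTO tbl_closure (ancestor_id, descendant_id, depth) "
        "SELECT ancestor_id, ?, depth + 1 FROM tbl_closure WHERE descendant_id = ?",
        (node_id, parent_id),
    )


def build_sample():
    conn = make_db()
    for node in [(1, None, "root"), (2, 1, "a"), (3, 1, "b"), (4, 2, "a1"), (5, 4, "a1x")]:
        add_node(conn, *node)
    return conn


def descendants(conn, node_id):
    """(name, depth) of every descendant, nearest first."""
    return conn.execute(
        "SELECT o.name, c.depth FROM tbl_closure AS c "
        "JOIN tbl_orig AS o ON c.descendant_id = o.id "
        "WHERE c.ancestor_id = ? AND c.depth <> 0 ORDER BY c.depth, o.id",
        (node_id,),
    ).fetchall()


def ancestors(conn, node_id):
    """Ancestor names, root first."""
    rows = conn.execute(
        "SELECT o.name FROM tbl_closure AS c "
        "JOIN tbl_orig AS o ON c.ancestor_id = o.id "
        "WHERE c.descendant_id = ? AND c.depth <> 0 ORDER BY c.depth DESC",
        (node_id,),
    )
    return [r[0] for r in rows]
```

    Each node costs one closure row per ancestor plus its self row, which is the "multiple rows per node" overhead noted for this approach.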
  5. This sure seems like it has been over-complicated. From my experience, there are just three key things to get Excel to close properly:

    1: make sure there are no remaining references to the Excel application you created (you should only have one anyway; set it to null)

    2: call GC.Collect()

    3: Excel has to be closed, either by the user manually closing the program, or by you calling Quit on the Excel object. (Note that Quit will function just as if the user tried to close the program, and will present a confirmation dialog if there are unsaved changes, even if Excel is not visible. The user could press cancel, and then Excel will not have been closed.)

    1 needs to happen before 2, but 3 can happen anytime.

    One way to implement this is to wrap the interop Excel object with your own class, create the interop instance in the constructor, and implement IDisposable with Dispose looking something like

    That will clean up Excel from your program’s side of things. Once Excel is closed (manually by the user or by you calling Quit) the process will go away. If the program has already been closed, then the process will disappear on the GC.Collect() call.

    (I’m not sure how important it is, but you may want a GC.WaitForPendingFinalizers() call after the GC.Collect() call, although it is not strictly necessary to get rid of the Excel process.)

    This has worked for me without issue for years. Keep in mind though that while this works, you actually have to close gracefully for it to work. You will still get accumulating excel.exe processes if you interrupt your program before Excel is cleaned up (usually by hitting “stop” while your program is being debugged).

  6. Here is a really easy way to do it:

    [DllImport("User32.dll")]
    static extern uint GetWindowThreadProcessId(IntPtr hWnd, out int lpdwProcessId);
    ...
    int objExcelProcessId = 0;
    Excel.Application objExcel = new Excel.Application();
    GetWindowThreadProcessId(new IntPtr(objExcel.Hwnd), out objExcelProcessId);
    Process.GetProcessById(objExcelProcessId).Kill();
  7. My answer is late and its only purpose is to support the solution proposed by Govert.

    Short version:

    • Write a local function with no global variables and no arguments executing the COM stuff.

    • Call the COM function in a wrapping function that calls the COM function and cleans thereafter.

    Long version:

    You are not using .NET to count references of COM objects and to release them yourself in the correct order. Even C++ programmers don’t do that any longer; they use smart pointers. So, forget about Marshal.ReleaseComObject and the funny “one dot good, two dots bad” rule. The GC is happy to do the chore of releasing COM objects if you null out all references to COM objects that are no longer needed. The easiest way is to handle COM objects in a local function, with all variables for COM objects naturally going out of scope at the end.

    Due to some strange features of the debugger, pointed out in the brilliant answers of Hans Passant mentioned in the accepted answer’s Post Mortem, the cleanup should be delegated to a wrapping function that also calls the executing function. So, COM objects like Excel or Word need two functions: one that does the actual job, and a wrapper that calls this function and calls the GC afterwards, as Govert did in the only correct answer in this thread.

    To show the principle I use a wrapper suitable for all functions doing COM stuff. Except for this extension, my code is just the C# version of Govert’s code. In addition, I pause the process for 6 seconds so that you can check in the Task Manager that Excel is no longer visible after Quit() but lives on as a zombie until the GC puts an end to it.

    using System;
    using System.Threading;
    using Excel = Microsoft.Office.Interop.Excel;

    public delegate void WrapCom();

    namespace GCTestOnOffice{
      class Program{
        static void DoSomethingWithExcel(){
          Excel.Application ExcelApp = new();
          Excel.Workbook Wb = ExcelApp.Workbooks.Open(@"D:\Sample.xlsx");
          Excel.Worksheet NewWs = Wb.Worksheets.Add();
          for (int i = 1; i < 10; i++){ NewWs.Cells[i, 1] = i;}
          Wb.Save();
          ExcelApp.Quit();
        }

        static void TheComWrapper(WrapCom wrapCom){
          wrapCom();
          //All COM objects are out of scope, ready for the GC to gobble
          //Excel is no longer visible, but the process is still alive,
          //check out the Task-Manager in the next 6 seconds
          Thread.Sleep(6000);
          GC.Collect();
          GC.WaitForPendingFinalizers();
          GC.Collect();
          GC.WaitForPendingFinalizers();
          //Check out the Task-Manager, the Excel process is gone
        }

        static void Main(string[] args){
          TheComWrapper(DoSomethingWithExcel);
        }
      }
    }
  8. Just to add another solution to the many listed here, using C++/ATL automation (I imagine you could use something similar from VB/C#??)

    Excel::_ApplicationPtr pXL = ...
      :
    SendMessage ( ( HWND ) m_pXL->GetHwnd ( ), WM_DESTROY, 0, 0 ) ;

    This works like a charm for me…

  9. [DllImport("user32.dll")] private static extern uint GetWindowThreadProcessId(IntPtr hWnd, out uint lpdwProcessId);
  10. So far it seems all answers involve some of these:

    1. Kill the process
    2. Use GC.Collect()
    3. Keep track of every COM object and release it properly.

    Which makes me appreciate how difficult this issue is 🙂

    I have been working on a library to simplify access to Excel, and I am trying to make sure that people using it won’t leave a mess (fingers crossed).

    Instead of writing directly against the interfaces Interop provides, I am making extension methods to make life easier, like ApplicationHelpers.CreateExcel() or workbook.CreateWorksheet("mySheetNameThatWillBeValidated"). Naturally, anything that is created may lead to an issue later on when cleaning up, so I am actually in favor of killing the process as a last resort. Yet cleaning up properly (the third option) is probably the least destructive and most controlled.

    So, in that context I was wondering whether it wouldn’t be best to make something like this:

    public abstract class ReleaseContainer<T>
    {
        private readonly Action<T> actionOnT;

        protected ReleaseContainer(T releasible, Action<T> actionOnT)
        {
            this.actionOnT = actionOnT;
            this.Releasible = releasible;
        }

        ~ReleaseContainer()
        {
            Release();
        }

        public T Releasible { get; private set; }

        private void Release()
        {
            actionOnT(Releasible);
            Releasible = default(T);
        }
    }

    I used ‘Releasible’ to avoid confusion with Disposable. Extending this to IDisposable should be easy though.

    An implementation like this:

    public class ApplicationContainer : ReleaseContainer<Application>
    {
        public ApplicationContainer()
            : base(new Application(), ActionOnExcel)
        {
        }

        private static void ActionOnExcel(Application application)
        {
            application.Show(); // extension method. want to make sure the app is visible.
            application.Quit();
            Marshal.FinalReleaseComObject(application);
        }
    }

    And one could do something similar for all sorts of COM objects.

    In the factory method:

    public static Application CreateExcelApplication(bool hidden = false)
    {
        var excel = new ApplicationContainer().Releasible;
        excel.Visible = !hidden;

        return excel;
    }

    I would expect that every container will be destructed properly by the GC, and therefore automatically make the call to Quit and Marshal.FinalReleaseComObject.

    Comments? Or is this an answer to the question of the third kind?

  11. Here is an idea: kill the Excel process that you opened yourself.

    1. Before opening an Excel application, collect the IDs of all running Excel processes (oldProcessIds).
    2. Open the Excel application.
    3. Collect the IDs of all running Excel processes again (nowProcessIds).
    4. When you need to quit, kill the process whose ID is in nowProcessIds but not in oldProcessIds.

    private static Excel.Application GetExcelApp()
    {
        if (_excelApp == null)
        {
            var processIds = System.Diagnostics.Process.GetProcessesByName("EXCEL").Select(a => a.Id).ToList();
            _excelApp = new Excel.Application();
            _excelApp.DisplayAlerts = false;

            _excelApp.Visible = false;
            _excelApp.ScreenUpdating = false;
            var newProcessIds = System.Diagnostics.Process.GetProcessesByName("EXCEL").Select(a => a.Id).ToList();
            _excelApplicationProcessId = newProcessIds.Except(processIds).FirstOrDefault();
        }

        return _excelApp;
    }

    public static void Dispose()
    {
        try
        {
            _excelApp.Workbooks.Close();
            _excelApp.Quit();
            System.Runtime.InteropServices.Marshal.ReleaseComObject(_excelApp);
            _excelApp = null;
            GC.Collect();
            GC.WaitForPendingFinalizers();
            if (_excelApplicationProcessId != default(int))
            {
                var process = System.Diagnostics.Process.GetProcessById(_excelApplicationProcessId);
                process?.Kill();
                _excelApplicationProcessId = default(int);
            }
        }
        catch (Exception ex)
        {
            _excelApp = null;
        }
    }
  12. Tested with Microsoft Excel 2016

    A really tested solution.

    For the C# reference, see: https://stackoverflow.com/a/1307180/10442623

    For the VB.NET reference, see: https://stackoverflow.com/a/54044646/10442623

    1 include the class job

    2 implement the class to handle the appropriate disposal of the Excel process

  13. I had this same problem getting PowerPoint to close after newing up the Application object in my VSTO AddIn. I tried all the answers here with limited success.

    This is the solution I found for my case: DON’T use ‘new Application’; the AddInBase base class of ThisAddIn already has a handle to ‘Application’. If you use that handle where you need it (make it static if you have to), then you don’t need to worry about cleaning it up and PowerPoint won’t hang on close.

  14. Of the three general strategies considered in other answers, killing the Excel process is clearly a hack, whereas invoking the garbage collector is a brutal shotgun approach meant to compensate for incorrect deallocation of COM objects. After lots of experimentation and rewriting the management of COM objects in my version-agnostic and late-bound wrapper, I have come to the conclusion that accurate and timely invocation of Marshal.ReleaseComObject() is the most efficient and elegant strategy. And no, you do not ever need FinalReleaseComObject(), because in a well-written program each COM object is acquired only once and therefore requires a single decrement of the reference counter.

    One should make sure to release every single COM object, preferably as soon as it is no longer needed. But it is perfectly possible to release everything right after quitting the Excel application, at the sole expense of higher memory usage. Excel will close as expected as long as one does not lose or forget to release a COM object.

    The simplest and most obvious aid in the process is wrapping every interop object in a .NET class implementing IDisposable, where the Dispose() method invokes ReleaseComObject() on its interop object. Doing it in the destructor, as proposed here, makes no sense because destructors are non-deterministic.

    Shown below is our wrapper’s method that obtains a cell from WorkSheet, bypassing the intermediate Cells member. Notice the way it disposes of the intermediate object after use:

    public ExcelRange XCell( int row, int col)
    {
        ExcelRange anchor, res;
        using( anchor = Range( "A1") )
        {
            res = anchor.Offset( row - 1, col - 1 );
        }
        return res;
    }

    The next step may be a simple memory manager that will keep track of every COM object obtained and make sure to release it after Excel quits if the user prefers to trade some RAM usage for simpler code.

    Further reading

    1. How to properly release Excel COM objects,
    2. Releasing COM objects: Garbage Collector vs. Marshal.ReleaseComObject.
  15. I really like when things clean up after themselves… So I made some wrapper classes that do all the cleanup for me! These are documented further down.

    The end code is quite readable and accessible. I haven’t yet found any phantom instances of Excel running after I Close() the workbooks and Quit() the application (besides where I debug and close the app mid process).

    void OpenCopyClose()
    {
        var excel = new ExcelApplication();
        var workbook1 = excel.OpenWorkbook("C:Tempfile1.xslx", readOnly: true);
        var readOnlysheet = workbook1.Worksheet("sheet1");

        var workbook2 = excel.OpenWorkbook("C:Tempfile2.xslx");
        var writeSheet = workbook2.Worksheet("sheet1");

        // do all the excel manipulation

        // read from the first workbook, write to the second workbook.
        var a1 = readOnlysheet.Cells[1, 1];
        writeSheet.Cells[1, 1] = a1;

        // explicit clean-up
        workbook1.Close(false);
        workbook2.Close(true);
        excel.Quit();
    }

    Note: You can skip the Close() and Quit() calls but if you are writing to an Excel document you will at least want to Save(). When the objects go out of scope (the method returns) the class finalizers will automatically kick in and do any cleanup. Any references to COM objects from the Worksheet COM object will automatically be managed and cleaned up as long as you are careful with the scope of your variables, eg keep variables local to the current scope only when storing references to COM objects. You can easily copy values you need to POCOs if you need, or create additional wrapper classes as discussed below.

    To manage all this, I have created a class, DisposableComObject, that acts as a wrapper for any COM object. It implements the IDisposable interface and also contains a finalizer for those that don’t like using.

    The Dispose() method calls Marshal.ReleaseComObject(ComObject) and then sets the ComObjectRef property to null.

    The object is in a disposed state when the private ComObjectRef property is null.

    If the ComObject property is accessed after being disposed, a ComObjectAccessedAfterDisposeException exception is thrown.

    The Dispose() method can be called manually. It is also called by the finalizer, at the conclusion of a using block, and for using var at the conclusion of the scope of that variable.

    The top level classes from Microsoft.Office.Interop.Excel, Application, Workbook, and Worksheet, get their own wrapper classes where each are subclasses of DisposableComObject

    Here is the code:

    /// <summary>
    /// References to COM objects must be explicitly released when done.
    /// Failure to do so can result in odd behavior and processes remaining running after the application has stopped.
    /// This class helps to automate the process of disposing the references to COM objects.
    /// </summary>
    public abstract class DisposableComObject : IDisposable
    {
        public class ComObjectAccessedAfterDisposeException : Exception
        {
            public ComObjectAccessedAfterDisposeException() : base("COM object has been accessed after being disposed") { }
        }

        /// <summary>The actual COM object</summary>
        private object ComObjectRef { get; set; }

        /// <summary>The COM object to be used by subclasses</summary>
        /// <exception cref="ComObjectAccessedAfterDisposeException">When the COM object has been disposed</exception>
        protected object ComObject => ComObjectRef ?? throw new ComObjectAccessedAfterDisposeException();

        public DisposableComObject(object comObject) => ComObjectRef = comObject;

        /// <summary>
        /// True, if the COM object has been disposed.
        /// </summary>
        protected bool IsDisposed() => ComObjectRef is null;

        public void Dispose()
        {
            Dispose(true);
            GC.SuppressFinalize(this); // in case a subclass implements a finalizer
        }

        /// <summary>
        /// This method releases the COM object and removes the reference.
        /// This allows the garbage collector to clean up any remaining instance.
        /// </summary>
        /// <param name="disposing">Set to true</param>
        protected virtual void Dispose(bool disposing)
        {
            if (!disposing || IsDisposed()) return;
            Marshal.ReleaseComObject(ComObject);
            ComObjectRef = null;
        }

        ~DisposableComObject()
        {
            Dispose(true);
        }
    }

    There is also a handy generic subclass which makes usage slightly easier.

    public abstract class DisposableComObject<T> : DisposableComObject
    {
        protected new T ComObject => (T)base.ComObject;

        public DisposableComObject(T comObject) : base(comObject) { }
    }

    Finally, we can use DisposableComObject<T> to create our wrapper classes for the Excel interop classes.

    The ExcelApplication subclass has a reference to a new Excel application instance and is used to open workbooks.

    OpenWorkbook() returns an ExcelWorkbook which is also a subclass of DisposableComObject.

    Dispose() has been overridden to quit the Excel application before calling the base Dispose() method. Quit() is an alias of Dispose().

    public class ExcelApplication : DisposableComObject<Application>
    {
        public class OpenWorkbookActionCancelledException : Exception
        {
            public string Filename { get; }

            public OpenWorkbookActionCancelledException(string filename, COMException ex) : base($"The workbook open action was cancelled. {ex.Message}", ex) => Filename = filename;
        }

        /// <summary>The actual Application from Interop.Excel</summary>
        Application App => ComObject;

        public ExcelApplication() : base(new Application()) { }

        /// <summary>Open a workbook.</summary>
        public ExcelWorkbook OpenWorkbook(string filename, bool readOnly = false, string password = null, string writeResPassword = null)
        {
            try
            {
                // The trailing comma after the last argument was removed; it does not compile.
                var workbook = App.Workbooks.Open(Filename: filename, UpdateLinks: (XlUpdateLinks)0, ReadOnly: readOnly, Password: password, WriteResPassword: writeResPassword);

                return new ExcelWorkbook(workbook);
            }
            catch (COMException ex)
            {
                // If the workbook is already open and the request mode is not read-only, the user will be presented
                // with a prompt from the Excel application asking if the workbook should be opened in read-only mode.
                // This exception is raised when the user clicks the Cancel button in that prompt.
                throw new OpenWorkbookActionCancelledException(filename, ex);
            }
        }

        /// <summary>Quit the running application.</summary>
        public void Quit() => Dispose(true);

        /// <inheritdoc/>
        protected override void Dispose(bool disposing)
        {
            if (!disposing || IsDisposed()) return;
            App.Quit();
            base.Dispose(disposing);
        }
    }

    ExcelWorkbook also subclasses DisposableComObject<Workbook> and is used to open worksheets.

    The Worksheet() methods return an ExcelWorksheet which, you guessed it, is also a subclass of DisposableComObject, in this case DisposableComObject<Worksheet>.

    The Dispose() method is overridden and first closes the workbook before calling the base Dispose().

    NOTE: I’ve added some extension methods which it uses to iterate over Workbook.Worksheets. If you get compile errors, this is why. I’ll add the extension methods at the end.

    public class ExcelWorkbook : DisposableComObject<Workbook>
    {
        public class WorksheetNotFoundException : Exception
        {
            public WorksheetNotFoundException(string message) : base(message) { }
        }

        /// <summary>The actual Workbook from Interop.Excel</summary>
        Workbook Workbook => ComObject;

        /// <summary>The worksheets within the workbook</summary>
        public IEnumerable<ExcelWorksheet> Worksheets => worksheets ?? (worksheets = Workbook.Worksheets.AsEnumerable<Worksheet>().Select(w => new ExcelWorksheet(w)).ToList());
        private IEnumerable<ExcelWorksheet> worksheets;

        public ExcelWorkbook(Workbook workbook) : base(workbook) { }

        /// <summary>
        /// Get the worksheet matching the <paramref name="sheetName"/>
        /// </summary>
        /// <param name="sheetName">The name of the Worksheet</param>
        public ExcelWorksheet Worksheet(string sheetName) => Worksheet(s => s.Name == sheetName, () => $"Worksheet not found: {sheetName}");

        /// <summary>
        /// Get the worksheet matching the <paramref name="predicate"/>
        /// </summary>
        /// <param name="predicate">A function to test each Worksheet for a match</param>
        public ExcelWorksheet Worksheet(Func<ExcelWorksheet, bool> predicate, Func<string> errorMessageAction) => Worksheets.FirstOrDefault(predicate) ?? throw new WorksheetNotFoundException(errorMessageAction.Invoke());

        /// <summary>
        /// Returns true if the workbook is read-only
        /// </summary>
        public bool IsReadOnly() => Workbook.ReadOnly;

        /// <summary>
        /// Save changes made to the workbook
        /// </summary>
        public void Save()
        {
            Workbook.Save();
        }

        /// <summary>
        /// Close the workbook and optionally save changes
        /// </summary>
        /// <param name="saveChanges">True to save before closing</param>
        public void Close(bool saveChanges)
        {
            if (saveChanges) Save();
            Dispose(true);
        }

        /// <inheritdoc/>
        protected override void Dispose(bool disposing)
        {
            if (!disposing || IsDisposed()) return;
            Workbook.Close();
            base.Dispose(disposing);
        }
    }

    Finally, the ExcelWorksheet.

    UsedRows() simply returns an enumerable of unwrapped Microsoft.Office.Interop.Excel.Range objects. I haven’t yet encountered a situation where COM objects accessed from properties of the Microsoft.Office.Interop.Excel.Worksheet object need to be manually wrapped, as was needed with Application, Workbook, and Worksheet. These all seem to clean themselves up automatically. Mostly, I was just iterating over Ranges and getting or setting values, so my particular use case isn’t as advanced as the available functionality.

    There is no override of Dispose() in this case as no special action needs to take place for worksheets.

    public class ExcelWorksheet : DisposableComObject<Worksheet>
    {
        /// <summary>The actual Worksheet from Interop.Excel</summary>
        Worksheet Worksheet => ComObject;

        /// <summary>The worksheet name</summary>
        public string Name => Worksheet.Name;

        /// <summary>The worksheet's cells (unwrapped COM object)</summary>
        public Range Cells => Worksheet.Cells;

        public ExcelWorksheet(Worksheet worksheet) : base(worksheet) { }

        /// <inheritdoc cref="WorksheetExtensions.UsedRows(Worksheet)"/>
        public IEnumerable<Range> UsedRows() => Worksheet.UsedRows().ToList();
    }

    It is possible to add even more wrapper classes. Just add additional methods to ExcelWorksheet as needed and return the COM object in a wrapper class. Just copy what we did when wrapping the workbook via ExcelApplication.OpenWorkbook() and ExcelWorkbook.WorkSheets.

    Some useful extension methods:

    public static class EnumeratorExtensions
    {
        /// <summary>
        /// Converts the <paramref name="enumerator"/> to an IEnumerable of type <typeparamref name="T"/>
        /// </summary>
        public static IEnumerable<T> AsEnumerable<T>(this IEnumerable enumerator)
        {
            return enumerator.GetEnumerator().AsEnumerable<T>();
        }

        /// <summary>
        /// Converts the <paramref name="enumerator"/> to an IEnumerable of type <typeparamref name="T"/>
        /// </summary>
        public static IEnumerable<T> AsEnumerable<T>(this IEnumerator enumerator)
        {
            while (enumerator.MoveNext()) yield return (T)enumerator.Current;
        }

        /// <summary>
        /// Converts the <paramref name="enumerator"/> to an IEnumerable of type <typeparamref name="T"/>
        /// </summary>
        public static IEnumerable<T> AsEnumerable<T>(this IEnumerator<T> enumerator)
        {
            while (enumerator.MoveNext()) yield return enumerator.Current;
        }
    }

    public static class WorksheetExtensions
    {
        /// <summary>
        /// Returns the rows within the used range of this <paramref name="worksheet"/>
        /// </summary>
        /// <param name="worksheet">The worksheet</param>
        public static IEnumerable<Range> UsedRows(this Worksheet worksheet) => worksheet.UsedRange.Rows.AsEnumerable<Range>();
    }
  16. Excel is not designed to be programmed via C++ or C#. The COM API is specifically designed to work with Visual Basic, VB.NET, and VBA.

    Also, none of the code samples on this page are optimal, for the simple reason that each call must cross a managed/unmanaged boundary. They further ignore the fact that the Excel COM API is free to fail any call with a cryptic HRESULT indicating that the RPC server is busy.

    The best way to automate Excel, in my opinion, is to collect your data into as big an array as is feasible and send it across to a VBA function or sub (via Application.Run), which then performs any required processing. Furthermore, when calling Application.Run, be sure to watch for exceptions indicating that Excel is busy, and retry the call.
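
    The retry advice above could be sketched roughly as follows. This is only an illustration, not code from the answer: the ComRetry class, its Run method, and the attempt and delay defaults are hypothetical names and values, and it assumes the "busy" condition surfaces as a COMException carrying RPC_E_CALL_REJECTED or RPC_E_SERVERCALL_RETRYLATER. Excel may report other HRESULTs as well, so the IsBusy check would likely need extending.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public static class ComRetry
{
    // HRESULTs that COM uses to signal a busy server (assumed to be the relevant ones here).
    const int RPC_E_CALL_REJECTED = unchecked((int)0x80010001);
    const int RPC_E_SERVERCALL_RETRYLATER = unchecked((int)0x8001010A);

    /// <summary>
    /// Invokes <paramref name="call"/>, retrying while it fails with a "server busy" COMException.
    /// The last failure is rethrown once <paramref name="maxAttempts"/> is reached.
    /// </summary>
    public static T Run<T>(Func<T> call, int maxAttempts = 5, int delayMs = 200)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return call();
            }
            catch (COMException ex) when (IsBusy(ex) && attempt < maxAttempts)
            {
                // Give Excel a moment to finish what it is doing, then try again.
                Thread.Sleep(delayMs);
            }
        }
    }

    static bool IsBusy(COMException ex) =>
        ex.ErrorCode == RPC_E_CALL_REJECTED || ex.ErrorCode == RPC_E_SERVERCALL_RETRYLATER;
}
```

    With an Interop Application instance named app and a VBA macro called "ProcessData" (both hypothetical), a call might look like ComRetry.Run(() => app.Run("ProcessData", data)), where data is the array collected beforehand.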

  17. This is the only way that really works for me

    foreach (Process proc in System.Diagnostics.Process.GetProcessesByName("EXCEL"))
    {
        proc.Kill();
    }
