AX freeze up, “Cannot create a record in Number sequence list (NumberSequenceList)”

This was very close to a worst-case scenario:  We spent months building this ERP system.  All of our testing passed fine.  ERP system went live and the dust is settling.  Then, out of the blue, AX starts freezing up.  Mostly, this was in the accounting screens.  So, at least it only affected our most demanding users and not everybody.
When I went into one record in the AP journal, I would get this message: Cannot create a record in Number sequence list (NumberSequenceList). …  The record already exists.  On another record in AP, I could get into it, but when I clicked any buttons, AX Client would freeze.  I had to kill the process to exit AX.
 
Things that were NOT wrong: I was the Admin, I was logged on the AOS machine, SQL Server was fine, all of the data was fine, not every screen was doing this (CRM was unaffected), this only was happening in one company, and I could not recreate the problem in the dev or test environments.  I tried viewing the following logs, which were no help: Windows Event Log, AX logs and SQL Logs.
 
Naturally, I started binging/googling the problem.  I found an article from Paul Steynberg that sounded like he had similar problems.  His advice didn’t solve my locking issue, but it did get me started down the right path.
 
It seems as if the table SysUtilElementsLog is some kind of table locking/latching mechanism that was added in AX 2009 (possibly for helping manage resources shared by load-balanced servers or something).  This table contains a list of the tables (presumably, the ones that would have a lot of contention and would need to be managed via some locking scheme).  The most popular tables on the list contained the highest numbers.  I tried truncating the values (stop AOS, backup the db, set “UseCount =1” for all rows, start AOS).  That seemed to help for me, but not the other users.  #fail.
 
Then, I thought, maybe it was a permission issue.  (try….) Nope. #fail.
 
I did notice that, every time I restarted the AOS, I could get in and do stuff, but it always started by giving me a message (ONCE) about Cannot create a record in  Number sequence list (NumberSequenceList)…  I went into Basic, Number Sequence, and did a cleanup of the sequence in question and the error went away.  It seemed odd that it kept coming up every time I restarted the AOS.  It seemed like that sequence wasn’t incrementing or it got incremented but not saved to the database.
 
Several articles that I read, talked about solving lock ups/freeze ups by monitoring the locks via SQL server.  Well, that was a jim-dandy idea, except I found that there were a bunch of locks in SQL (well, duh) and no advice about what to do with them.  Gah!  Thanks for nothing!
 
I did some more looking around in AX until I found the following tables:
NumberSequenceList – the settings for all number sequences
NumberSequenceTable – the state & locking table for number sequences
NumberSequenceTTS – the transaction table for number sequence locks *** eureka! ***
 
If you query SQL server with the following query:
    SELECT * FROM NumberSequenceList WHERE DataAreaID = ‘[company with the locking problem]’ AND TransID > 0
This will show you the tables with open/active transactions.  Any one of these may be locking things up.
You really want to keep an eye on records with a [Status] = 1.  This means that the record is in use and may be blocking someone else from using it.
 
Compare the results of that query to this one:
    SELECT * FROM NumberSequenceTTS WHERE DataAreaID = ‘[company with the locking problem]’
EVERY record in this table represents an open transaction and may result in a lock that blocks other users or freezes up the AX Client for people.
It is particularly bad if there is a row in the NumberSequenceList that does not have a matching record in the NumberSequenceTTS table (matched on TransID).  This would mean there is a lock in one table and no reference by which to remove that lock.  Basically an orphaned lock.  Pure evil!
 
It just happened that my orphaned record was on the NumberSequenceList but not the NumberSequenceTTS table.
The way I resolved the problem and cleared the log jam was to :
1. Stop the AOS
2. Backup the SQL database
3. Removed the orphaned record from the NumberSequenceList Table
4. Start the AOS
 
All of the other records that were in the NumberSequenceTTS table were, apparently, waiting on the orphaned record to clear, because they all got processed and cleared out.
The system was working, no lockups or freeze ups.  Everyone stopped bleeding from their eyes, the sun came out and birds sang.
*deep, cleansing breath ….and let it out…. aaaaaaah*
 
 
Advertisements

About Tim Golisch

I'm a geek. I do geeky things.
This entry was posted in Dynamics AX, Errors, IT Horror Stories. Bookmark the permalink.

6 Responses to AX freeze up, “Cannot create a record in Number sequence list (NumberSequenceList)”

  1. wow! That’s is the most artistic description of AX locking issues I’ve encountered. Carry on!

  2. Ivo Tonev says:

    Thanks, very helpful indeed!

  3. Cago says:

    Hi

    We had recently trouble that you are describing in this blog.
    Your info really did help us to solve the issue. Thanks.

    The thing I would like to ask is how to prevent it? We did have same
    situation like 3 times in 2 months, and our customer is not happy
    about that as you can imagine.

    Did you do any research on the cause of such behaviour?
    Thanks for your reply.

    • Tim Golisch says:

      As far as I could tell, these orphaned locks occurred
      1. When the AOT crashed (un-expectedly)
      2. SQL Server crashed/rebooted/restarted while the AOT was running

      There is no reliable way to prevent the AOT or SQL from crashing.
      However, here are a few other things you can do to prevent orphaned locks:
      1. Always stop the AOT before servicing SQL server
      2. Strongly encourage people to log out of AX when they are away from their computers. Because if they have a screen open that is holding a lock, and you have to restart the AOT (or it crashes), this could orphan a lock.

      You might want to consider running a SQL job to look for orphaned locks and send an email to an admin if it finds one.

      Let me know if you see anything other than what I have found.

      Good luck sir!

  4. Mark says:

    We experienced this and other odd behavior when some clients were not on same app or kernal versions w/ rest of company.

  5. pablocp16 says:

    I experienced this, but your solution is not enough. I also deleted all numbersequencelist where status = 0 and then the solution was complete.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s