This was very close to a worst-case scenario: We spent months building this ERP system. All of our testing passed fine. ERP system went live and the dust is settling. Then, out of the blue, AX starts freezing up. Mostly, this was in the accounting screens. So, at least it only affected our most demanding users and not everybody.
When I went into one record in the AP journal, I would get this message: Cannot create a record in Number sequence list (NumberSequenceList). … The record already exists. On another record in AP, I could get into it, but when I clicked any buttons, AX Client would freeze. I had to kill the process to exit AX.
Things that were NOT wrong: I was the Admin, I was logged on the AOS machine, SQL Server was fine, all of the data was fine, not every screen was doing this (CRM was unaffected), this only was happening in one company, and I could not recreate the problem in the dev or test environments. I tried viewing the following logs, which were no help: Windows Event Log, AX logs and SQL Logs.
Naturally, I started binging/googling the problem. I found an article from Paul Steynberg that sounded like he had similar problems. His advice didn’t solve my locking issue, but it did get me started down the right path.
It seems as if the table SysUtilElementsLog is some kind of table locking/latching mechanism that was added in AX 2009 (possibly for helping manage resources shared by load-balanced servers or something). This table contains a list of the tables (presumably, the ones that would have a lot of contention and would need to be managed via some locking scheme). The most popular tables on the list contained the highest numbers. I tried truncating the values (stop AOS, backup the db, set “UseCount =1” for all rows, start AOS). That seemed to help for me, but not the other users. #fail.
Then, I thought, maybe it was a permission issue. (try….) Nope. #fail.
I did notice that, every time I restarted the AOS, I could get in and do stuff, but it always started by giving me a message (ONCE) about Cannot create a record in Number sequence list (NumberSequenceList)… I went into Basic, Number Sequence, and did a cleanup of the sequence in question and the error went away. It seemed odd that it kept coming up every time I restarted the AOS. It seemed like that sequence wasn’t incrementing or it got incremented but not saved to the database.
Several articles that I read, talked about solving lock ups/freeze ups by monitoring the locks via SQL server. Well, that was a jim-dandy idea, except I found that there were a bunch of locks in SQL (well, duh) and no advice about what to do with them. Gah! Thanks for nothing!
I did some more looking around in AX until I found the following tables:
– NumberSequenceList – the settings for all number sequences
– NumberSequenceTable – the state & locking table for number sequences
– NumberSequenceTTS – the transaction table for number sequence locks *** eureka! ***
If you query SQL server with the following query:
SELECT * FROM NumberSequenceList WHERE DataAreaID = ‘[company with the locking problem]’ AND TransID > 0
This will show you the tables with open/active transactions. Any one of these may be locking things up.
You really want to keep an eye on records with a [Status] = 1. This means that the record is in use and may be blocking someone else from using it.
Compare the results of that query to this one:
SELECT * FROM NumberSequenceTTS WHERE DataAreaID = ‘[company with the locking problem]’
EVERY record in this table represents an open transaction and may result in a lock that blocks other users or freezes up the AX Client for people.
It is particularly bad if there is a row in the NumberSequenceList that does not have a matching record in the NumberSequenceTTS table (matched on TransID). This would mean there is a lock in one table and no reference by which to remove that lock. Basically an orphaned lock. Pure evil!
It just happened that my orphaned record was on the NumberSequenceList but not the NumberSequenceTTS table.
The way I resolved the problem and cleared the log jam was to :
1. Stop the AOS
2. Backup the SQL database
3. Removed the orphaned record from the NumberSequenceList Table
4. Start the AOS
All of the other records that were in the NumberSequenceTTS table were, apparently, waiting on the orphaned record to clear, because they all got processed and cleared out.
The system was working, no lockups or freeze ups. Everyone stopped bleeding from their eyes, the sun came out and birds sang.
*deep, cleansing breath ….and let it out…. aaaaaaah*