The real improvement comes when you adjust the update rate for each column to match the length of the track it has to sweep out. Instead of wrapping around when you reach the last column, you wrap around when your counter reaches the square of the number of columns, and you update the column corresponding to the integer square root of the counter. Counting columns from zero in the middle, column k then gets 2k + 1 updates per cycle, which grows roughly in proportion to the circumference it sweeps. This gets rid of the bright dense region in the middle and adds more updates out at the edges, making them less sparse.
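As a minimal sketch of that counter scheme (not the actual firmware; the function name `column_schedule` is made up for illustration, and column 0 is assumed to be the one nearest the axis of rotation):

```python
from collections import Counter
from math import isqrt


def column_schedule(num_columns):
    """Yield the column to update on each tick of one full cycle.

    The counter runs from 0 to num_columns**2 - 1 and the column is the
    integer square root of the counter, so column k is chosen 2k + 1 times.
    Assumes column 0 is the innermost column.
    """
    for counter in range(num_columns * num_columns):
        yield isqrt(counter)


schedule = list(column_schedule(4))
print(schedule)           # [0, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3]
print(Counter(schedule))  # updates per column: 1, 3, 5, 7
```

With four columns the cycle is 16 ticks long, and the outermost column gets seven of them while the innermost gets one.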
In practice it's complicated by the fact that these panels update two lines at once. Every time you update a column in the outer half, you're also updating one in the inner half. I couldn't find a simple procedural update strategy to spread these evenly, so I ended up generating a lookup table for it using simulated annealing.
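For a rough idea of how such a table could be generated, here is a small simulated annealing sketch. It is not the code used for the real table, and everything specific in it is an assumption: that pair p drives inner column p together with outer column half + p, that each pair gets as many slots as its outer column would under the square-root scheme, and that "evenly spread" means minimising the sum of squared gaps between successive updates of the same pair. The names `gap_cost` and `anneal_table` are hypothetical.

```python
import math
import random


def gap_cost(table, num_pairs):
    """Sum of squared cyclic gaps between successive updates of each pair.

    For a fixed number of slots per pair, this is smallest when the
    slots are evenly spaced around the cycle.
    """
    length = len(table)
    cost = 0
    for pair in range(num_pairs):
        slots = [i for i, p in enumerate(table) if p == pair]
        # Pair each slot with the next one, wrapping around the cycle.
        for a, b in zip(slots, slots[1:] + [slots[0] + length]):
            cost += (b - a) ** 2
    return cost


def anneal_table(num_columns, steps=5000, seed=0):
    rng = random.Random(seed)
    half = num_columns // 2
    # Assumed weighting: pair p gets 2 * (half + p) + 1 slots per cycle.
    table = [p for p in range(half) for _ in range(2 * (half + p) + 1)]
    cost = gap_cost(table, half)
    best, best_cost = table[:], cost
    for step in range(steps):
        # Linear cooling schedule; the starting temperature is a guess.
        temperature = 50.0 * (1 - step / steps) + 1e-9
        i, j = rng.randrange(len(table)), rng.randrange(len(table))
        table[i], table[j] = table[j], table[i]
        new_cost = gap_cost(table, half)
        accept = new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temperature)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = table[:], cost
        else:
            table[i], table[j] = table[j], table[i]  # undo the rejected swap
    return best, best_cost


table, cost = anneal_table(8)
print(cost, table)
```

The only move is swapping two slots, so the number of updates each pair receives never changes; the annealing only rearranges where in the cycle they fall.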