World.update gets stuck

Posts regarding potential bugs, enhancement requests, and general feedback on use of dyn4j
soylomass
Posts: 13
Joined: Fri Nov 18, 2016 3:40 pm

World.update gets stuck

Postby soylomass » Fri Nov 18, 2016 3:51 pm

Hi, I'm using dy4n serverside. Basically, I declare it in the main thread, and set some settings:

Code: Select all

world = new World();
world.setBounds(null);
Sap sap = new Sap<>();
world.setBroadphaseDetector(sap);
contactHandler = new ContactHandler(this, world);
world.addListener(contactHandler);
world.getSettings().setMaximumTranslation(15);
world.getSettings().setStepFrequency(3/60f);


Then, I create a ScheduledExecutorService and call it's scheduleAtFixedRate method to run the game loop:

Code: Select all

gameLoopExecutor.scheduleAtFixedRate(new Runnable() {
          double ns = 1000000000.0 / 60.0;
         
          double delta = 0;
          long timer = System.currentTimeMillis();
         
          int socketTicks = 0;
          int foodTicks = 0;
          int physicsTicks = 0;
          int infoTick = 0;
          float physTook = 0;
          float playersTook = 0;
          float foodTook = 0;
         
         @Override
         public void run() {
            try {
                 long now = System.nanoTime();
                 delta += (now - lastTime) / ns;
                 if(delta > 9)
                    System.out.println("REDUCED delta time");
                 delta = Math.min(delta, 9);
                
                 lastTime = now;
                 System.out.println();
                 System.out.print("[<]");
      
                 while (delta >= 1) {
                    physicsTicks ++;
                    socketTicks ++;
                    foodTicks ++;
                    infoTick ++;
                   
                    [b]if(physicsTicks >= 3){
                        physicsTicks = 0;
                       long start_time = Config.debug || infoTick >= 60 ? System.nanoTime() : 0;
      
                       if(infoTick >= 60)
                          System.out.println("Evaluating phys time");
                       
                       try {
                          System.out.print("[ph]");
                          System.out.println("st:"+world.update(3/60f,1));
                       }catch(Throwable e){
                          System.out.println("ERROR AT PHYS");
                          e.printStackTrace();
                       }[/b]
                       
                        if(Config.debug || infoTick >= 60){
                          long end_time = System.nanoTime();
                          double difference = (end_time - start_time)/1e6;
                          if(Config.debug){
                             System.out.println("Physics took "+difference+"ms");
                          }
                          
                          physTook = (float) difference;
                          
                        }
                    }
                   
                    if(socketTicks >= 3){
                       long start_time = Config.debug || infoTick >= 60 ? System.nanoTime() : 0;
      
                       if(infoTick >= 60)
                          System.out.println("Evaluating players time");
                       
                       try {
                          System.out.print("[pl]");
                         playersHandler.update((float) (socketTicks * ns) / 1000000);
                       }catch(Throwable e){
                          System.out.println("ERROR AT PLAYERS");
                          e.printStackTrace();
                       }
      
                        if(Config.debug || infoTick >= 60){
                          long end_time = System.nanoTime();
                          double difference = (end_time - start_time)/1e6;
                          if(Config.debug){
                              System.out.println("Players took "+difference+"ms");
                          }
                          
                          playersTook = (float) difference;
                        }
                      socketTicks = 0;
                    }
                   
                    if(foodTicks >= 3){
                       long start_time = Config.debug || infoTick >= 60 ? System.nanoTime() : 0;
      
                       if(infoTick >= 60)
                          System.out.println("Evaluating food time");
                       
                       try {
                          System.out.print("[fo]");
                          foodHandler.update((float) (foodTicks * ns) / 1000000);
                       }catch(Throwable e){
                          System.out.println("ERROR AT FOOD");
                          e.printStackTrace();
                       }
                       try {
                          System.out.print("[fl]");
                          floor.update((float) (foodTicks * ns) / 1000000);
                       }catch(Throwable e){
                          System.out.println("ERROR AT TERRAIN");
                          e.printStackTrace();
                       }
      
                        if(Config.debug || infoTick >= 60){
                          long end_time = System.nanoTime();
                          double difference = (end_time - start_time)/1e6;
                          if(Config.debug){
                              System.out.println("Food took "+difference+"ms");
                          }
                          
                          foodTook = (float) difference;
                        }
                        foodTicks = 0;
                    }
                   
                    if(infoTick >= 60){
                       System.out.println("The loop is looping - phys "+physTook+" players "+playersTook+" food "+foodTook+" total "+(physTook+playersTook+foodTook));
                       infoTick = 0;
                    }
                   
                     delta--;
                 }
                 System.out.print("[>]");
                 if(delta == 0){
                    System.out.println("delta is 0");
                 }
             }catch(Throwable e){
                System.out.println("FATAL ERROR AT GAME LOOP");
                 e.printStackTrace();
             }
         }
      },0,1000/60,TimeUnit.MILLISECONDS) ;


Everything works fine, until at some moment, and randomly (can happen with 20 people online or with 100), the process gets stuck in World.update. It doesn't return anything:

[<][ph]st:true
[pl][fo][fl][>]
[<][>]
[<][>]
[<][ph]st:true
[pl][fo][fl][>]
[<][>]
[<][>]
[<][ph]st:true
[pl][fo][fl][>]
[<][>]
[<][>]
[<][>]
[<][ph]st:true
[pl][fo][fl][>]
[<][>]
[<][>]
[<][ph] (Here world.update is called, and as you see, it doesn't return)
game server 6547 has 15 users
Server 0 irresponsive. ses shutdown? false, terminated? false (Here I check if the Scheduler has crashed, but not)
closed connection 7b35ead3-57cd-43e6-9ea0-9c8bc5a23ff3
closed connection a4d34079-9fa9-41df-a5a9-8a0c5ba919e4
closed connection 0bc80ee2-e23c-49a9-8c7d-b2c4b93ec8c9



What could cause the world to become irresponsive (or take an exagerated amount of time to respond)? The player handler and food handler create new bodies and apply forces, always after the world update.

I hit a roadblock so I'm desperate. The only thing left to try is to change dyn4j for jBox2D, but I don't want as I love dyn4j.

soylomass
Posts: 13
Joined: Fri Nov 18, 2016 3:40 pm

Re: World.update gets stuck

Postby soylomass » Fri Nov 18, 2016 4:01 pm

Should I try to add prints on dyn4 update code to see where exactly it got stuck? (what's the best way to do that?)

--

I use SAP BroadphaseDetector because the "Performance" docs said it's better if the app is not Raycast intensive, and indeed in works faster, but I don't think it could be related to the problem, right?

Also, the docs should be updated, because they refer to SAPIncremental and BruteForce which don't exist anymore (only Sap and DynamicAABBTree in the latest version)

soylomass
Posts: 13
Joined: Fri Nov 18, 2016 3:40 pm

Re: World.update gets stuck

Postby soylomass » Sat Nov 19, 2016 12:44 am

Update: I've commented out

Sap sap = new Sap<>();
world.setBroadphaseDetector(sap);

So as to use the default Broadphase (dynamicaabbtree) and now the server has been up and running for more than an hour (more than it would with SAP).

It's a good news (I'll confirm in a few hours anyway) but the bad news is that Sap is much faster with 50 to 100 dynamic bodies and almost 1800 static bodies, so if someone has any idea of why it might be freezing with Sap, I'd love to hear it.

William
Site Admin
Posts: 349
Joined: Sat Feb 06, 2010 10:23 pm

Re: World.update gets stuck

Postby William » Sat Nov 19, 2016 8:39 am

Interesting... Is there any way to dump the state of the world to make a replication case at the time it hangs?

Either way let me look at SAP to see if there's anything obvious. It doesn't see as much testing as DynamicAABBTree.

William

William
Site Admin
Posts: 349
Joined: Sat Feb 06, 2010 10:23 pm

Re: World.update gets stuck

Postby William » Sat Nov 19, 2016 9:30 am

Doing some testing with Math.random()-ly placed bodies (1800 static & 100 dynamic) I can't get it to halt completely. It's slow in the beginning while it ramps up the memory but otherwise not too bad. I also found two other things:

  1. Sap uses dyn4j's BinarySearchTree class, but doesn't tell it be balanced (I need to fix this). This could make the tree degenerate into a list after so many updates. This didn't happen in my test, but it's possible. In my test the depth of the tree was 24. When I turned the automatic balancing the depth was 13. The depth of the tree directly effects performance. My test may not be representative of yours since all my bodies were uniform in size.
  2. On a few runs of my test, it actually ran out of memory. Can you confirm that you are setting the max memory high enough?

William

soylomass
Posts: 13
Joined: Fri Nov 18, 2016 3:40 pm

Re: World.update gets stuck

Postby soylomass » Sat Nov 19, 2016 10:08 am

Sadly, the same happened with DynamicAABBTree., Just took more time.
The memory usage is low, around 15%. Although I didn't check when the update halted.

Also, i created a server locally with 600 bots and the world worked fine. So the first one would think is maybe in some IO handler i touch the world in a way i shouldn't, but not, I only update the world in PlayerHandler and FoodHandler, run after world.update.

All the bodies have a fixed mass of 1 (i set it with Body.SetMass).

It's the most unexplainable problem I've had ever.


---
William wrote:
  1. On a few runs of my test, it actually ran out of memory. Can you confirm that you are setting the max memory high enough?


What max memory setting are you talking about? Java's? In the server I run the jar in a 64bit Debian computer with 4GB of RAM, without any memory parameters, only "java -jar ..."
Last edited by soylomass on Sat Nov 19, 2016 10:33 am, edited 1 time in total.

soylomass
Posts: 13
Joined: Fri Nov 18, 2016 3:40 pm

Re: World.update gets stuck

Postby soylomass » Sat Nov 19, 2016 10:21 am

The difference is that this time I added System.out.print inside the world.step code:

[pl][fo][fl][>]
[<][ph]<abcdefghij>st:true
[pl][fo][fl][>]
[<][>]
[<][>]
[<][>]
[<][>]
[<][>]
[<][>]
[<][>]
[<][>]
[<][>]
[<][ph]<abcdefghiclosed conn...


It stopped before j (this.detect();)

Code: Select all

      // notify the step listener
      for (int i = 0; i < sSize; i++) {
         StepListener sl = stepListeners.get(i);
         sl.postSolve(this.step, this);
      }
      System.out.print("i");
      
      // after all has been updated find new contacts
      // this is done so that the user has the latest contacts
      // and the broadphase has the latest AABBs, etc.
      this.detect();
      System.out.print("j");


This could be random, but maybe isn't? I just checked once.

soylomass
Posts: 13
Joined: Fri Nov 18, 2016 3:40 pm

Re: World.update gets stuck

Postby soylomass » Sat Nov 19, 2016 12:06 pm

UPDATE:

I changed the run command to: java -d64 -Xms512m -Xmx4g -jar ...

As I read, it means run it in 64bit mode, with a minimum of 512MB and a max of 4GB. I don't know if it the 64bit mode or the memory ones, but the server has been running with Sap broadphase for more than 1 hour, and it also shows a much lower CPU usage in linux "top" command.

I'll report in a few more hours.

Btw, the server is running here, if you are curious about what is it about. If you remove the "/beta" from the URL you'll play in the original server, made with node.js (and therefore starts to lag with less than 40 people playing).

zoom
Posts: 143
Joined: Sun Mar 17, 2013 3:57 pm
Location: Stockholm, Sweden
Contact:

Re: World.update gets stuck

Postby zoom » Sat Nov 19, 2016 3:02 pm

You really should look into JXM and JConsole (or JVisualVM or similar). They are invaluable in these scenarios.
One thing that would fit the symptom (but it can of course be something else) is that you had to small max heap (-Xmx) meaning that as memory need increased the JVM spent more and more time trying to perform garbage collects, eventually it was so little free memory it spent most of its time in garbage collection. Somehting you would have seen clearly if you hooked up JConsole and plotted the memory consumption.

soylomass
Posts: 13
Joined: Fri Nov 18, 2016 3:40 pm

Re: World.update gets stuck

Postby soylomass » Sat Nov 19, 2016 3:20 pm

UPDATE: Even with the new run line, after a few hours, the server still got stuck at:

Code: Select all

 this.detect();


I'm really hopeless now.

---

zoom wrote:You really should look into JXM and JConsole (or JVisualVM or similar). They are invaluable in these scenarios.
One thing that would fit the symptom (but it can of course be something else) is that you had to small max heap (-Xmx) meaning that as memory need increased the JVM spent more and more time trying to perform garbage collects, eventually it was so little free memory it spent most of its time in garbage collection. Somehting you would have seen clearly if you hooked up JConsole and plotted the memory consumption.


I will try that, because aside that the only left way is to try another physics engine. Thanks.

--

I cannot use jConsole in my debian server because it doesn't have a gui installed, but I hooked in jvmtop and restarted the server, for now it says:

JvmTop 0.8.0 alpha - 20:34:15, amd64, 2 cpus, Linux 4.8.3-x86, load avg 0.95
http://code.google.com/p/jvmtop

PID MAIN-CLASS HPCUR HPMAX NHCUR NHMAX CPU GC VM USERNAME #T DL
23752 server.jar 51m 878m 30m n/a 23.28% 0.94% O8U11 root 18
23776 m.jvmtop.JvmTop 17m 878m 19m n/a 0.48% 0.00% O8U11 root 14


It's strange that it says Max Heap 878m if I set 4GB in the run command, I'll check why this happens.


------------


UPDATE 2:

It seems I didn't execute the jar with that command before, as I used the "restart" command in my script, which had a different launch command.


Return to “Bugs, Enhancements, Feedback”

Who is online

Users browsing this forum: No registered users and 1 guest