1. Caused by: java.net.SocketException: Software caused connection abort: recv failed
To fix this issue, replace your usage of PhantomJSDriver with the following FixedPhantomJSDriver:
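    import java.util.Map;

    import org.openqa.selenium.Capabilities;
    import org.openqa.selenium.phantomjs.PhantomJSDriver;
    import org.openqa.selenium.phantomjs.PhantomJSDriverService;
    import org.openqa.selenium.remote.Response;
    import org.openqa.selenium.remote.UnreachableBrowserException;

    public class FixedPhantomJSDriver extends PhantomJSDriver {

        // Number of extra attempts after the first failed command.
        private final int retryCount = 2;

        public FixedPhantomJSDriver() {
        }

        public FixedPhantomJSDriver(Capabilities desiredCapabilities) {
            super(desiredCapabilities);
        }

        public FixedPhantomJSDriver(PhantomJSDriverService service, Capabilities desiredCapabilities) {
            super(service, desiredCapabilities);
        }

        // Every WebDriver command goes through execute, so retrying here covers all HTTP traffic.
        @Override
        protected Response execute(String driverCommand, Map<String, ?> parameters) {
            int retryAttempt = 0;
            while (true) {
                try {
                    return super.execute(driverCommand, parameters);
                } catch (UnreachableBrowserException e) {
                    retryAttempt++;
                    if (retryAttempt > retryCount) {
                        throw e; // retries exhausted, rethrow the last failure
                    }
                }
            }
        }
    }

So in summary: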
org.openqa.selenium.remote.UnreachableBrowserException: Error communicating with the remote browser. It may have died.
caused by:
Caused by: java.net.SocketException: Software caused connection abort: recv failed
(the recv failed is important)
happens (from my investigations) when the connection is closed prematurely. In our case it seemed to be caused by SSL certificate handling in Java (I'm still investigating this), and it occurs at random. Luckily, all HTTP traffic goes through the execute method, so overriding it and adding simple retry logic gives you a working workaround (it helped on our project: we never had failing or flaky tests again).
Although this implementation is specific to the PhantomJS driver, the same approach should work for other drivers as well.
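To make the "same approach for other drivers" idea concrete, here is a minimal sketch of the retry loop pulled out into a standalone helper. `RetryingCall` and `withRetry` are hypothetical names of my own, not part of Selenium; the sketch only assumes the exception you want to tolerate is a RuntimeException (as UnreachableBrowserException is).

```java
import java.util.function.Supplier;

/** Hypothetical helper (not part of Selenium): retries a call when a given exception type is thrown. */
final class RetryingCall {

    private RetryingCall() {
    }

    /**
     * Runs the call, retrying up to retryCount extra times when it throws
     * an exception of type retryOn. Any other exception is rethrown at once.
     */
    static <T> T withRetry(Supplier<T> call, int retryCount,
                           Class<? extends RuntimeException> retryOn) {
        int retryAttempt = 0;
        while (true) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                if (!retryOn.isInstance(e)) {
                    throw e; // not the failure we were asked to tolerate
                }
                retryAttempt++;
                if (retryAttempt > retryCount) {
                    throw e; // retries exhausted, surface the last failure
                }
            }
        }
    }
}
```

In a driver subclass, the overridden execute method could then presumably delegate to something like `RetryingCall.withRetry(() -> super.execute(driverCommand, parameters), 2, UnreachableBrowserException.class)`. I have not tried this against every driver, so treat it as a starting point.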
2. Caused by: java.net.SocketTimeoutException: Read timed out
But there is still a chance that you have different symptoms, and your browser just hangs for some time until it throws:
Caused by: java.net.SocketTimeoutException: Read timed out
If you're working on a Windows machine, chances are that you've reached the limit of open connections. This typically happens because Selenium creates a lot of connections and Windows keeps them open/cached even after Java has issued a close on the connection.
To fix this issue you need to change Windows registry values under:
    HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters

You need to set or create two DWORD values:
    MaxUserPort = 32768
    TcpTimedWaitDelay = 30

MaxUserPort raises the limit of possible open connections (you can choose any value between 5000 and 65534; the higher the better). TcpTimedWaitDelay makes sure that Windows releases stale connections (already closed by Java) after 30 seconds (it can't be set lower, and without this setting the default is 4 minutes!). In most cases this should fix your "hang" issue.
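If you would rather script the change than edit the registry by hand, the two values above can be written with the standard Windows `reg add` command. The sketch below only builds the command lines and checks the ranges mentioned above; `TcpRegistryCommands` is a hypothetical helper name of my own, and actually running the commands is left to you.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds "reg add" command lines for the two TCP settings. */
final class TcpRegistryCommands {

    static final String KEY =
            "HKLM\\SYSTEM\\CurrentControlSet\\Services\\Tcpip\\Parameters";

    private TcpRegistryCommands() {
    }

    /** Builds one "reg add" command that sets a DWORD value under KEY. */
    static List<String> regAdd(String valueName, int value) {
        return Arrays.asList("reg", "add", KEY,
                "/v", valueName,
                "/t", "REG_DWORD",
                "/d", Integer.toString(value),
                "/f"); // /f overwrites an existing value without prompting
    }

    /** Validates the ranges given in the text, then returns both commands. */
    static List<List<String>> commands(int maxUserPort, int tcpTimedWaitDelay) {
        if (maxUserPort < 5000 || maxUserPort > 65534) {
            throw new IllegalArgumentException("MaxUserPort must be between 5000 and 65534");
        }
        if (tcpTimedWaitDelay < 30) {
            throw new IllegalArgumentException("TcpTimedWaitDelay cannot be lower than 30 seconds");
        }
        return Arrays.asList(
                regAdd("MaxUserPort", maxUserPort),
                regAdd("TcpTimedWaitDelay", tcpTimedWaitDelay));
    }
}
```

Each returned list can be handed to `new ProcessBuilder(command).inheritIO().start()` on the Windows machine. This needs an elevated (administrator) process, and a reboot is typically required before the new TCP settings take effect.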
3. When your test still hangs for 3 hours until it fails
Unfortunately, there is still a small chance that your test will get stuck for 3 hours! The reason is that HttpClientFactory in Selenium has the socket timeout hardcoded to 3 hours. There is a proposed fix to the Selenium core code, but until it is accepted there is no way to change the timeout through normal means. For those who unfortunately must bear the pain, here is how to use my hacky but working workaround:
    import org.openqa.selenium.WebDriver;
    import org.openqa.selenium.phantomjs.PhantomJSDriver;

    public class FixExample {
        public static void main(String[] args) {
            // this is my custom workaround
            HttpParamsSetter.setSoTimeout(60 * 1000); // set socket timeout to 1 minute

            // and here goes your custom code
            WebDriver driver = new PhantomJSDriver();
            ...
            driver.quit();
        }
    }

Here you can find the code that does the magic (this works because the fields HttpCommandExecutor.httpClientFactory and HttpClientFactory.client are static and initialized only once):
    import org.apache.http.impl.client.DefaultHttpClient;
    import org.apache.http.params.HttpConnectionParams;
    import org.apache.http.params.HttpParams;
    import org.openqa.selenium.remote.HttpCommandExecutor;
    import org.openqa.selenium.remote.internal.HttpClientFactory;

    import java.lang.reflect.Field;

    public class HttpParamsSetter {

        @SuppressWarnings("deprecation")
        public static void setSoTimeout(int soTimeout) {
            // Reuse the factory Selenium already created, or create one if none exists yet.
            HttpClientFactory factory = getStaticValue(HttpCommandExecutor.class, "httpClientFactory");
            if (factory == null) {
                factory = new HttpClientFactory();
            }

            // Override the socket timeout on the shared HTTP client.
            DefaultHttpClient httpClient = (DefaultHttpClient) factory.getHttpClient();
            HttpParams params = httpClient.getParams();
            HttpConnectionParams.setSoTimeout(params, soTimeout);
            httpClient.setParams(params);

            // Store the factory back so every later command executor picks it up.
            setStaticValue(HttpCommandExecutor.class, "httpClientFactory", factory);
        }

        @SuppressWarnings("unchecked")
        private static <T> T getStaticValue(Class<?> aClass, String fieldName) {
            Field field = null;
            Boolean isAccessible = null;
            try {
                field = aClass.getDeclaredField(fieldName);
                isAccessible = field.isAccessible();
                field.setAccessible(true);
                return (T) field.get(null);
            } catch (NoSuchFieldException | IllegalAccessException e) {
                throw new RuntimeException(e);
            } finally {
                // Restore the original accessibility flag.
                if (field != null && isAccessible != null) {
                    field.setAccessible(isAccessible);
                }
            }
        }

        private static void setStaticValue(Class<?> aClass, String fieldName, Object value) {
            Field field = null;
            Boolean isAccessible = null;
            try {
                field = aClass.getDeclaredField(fieldName);
                isAccessible = field.isAccessible();
                field.setAccessible(true);
                field.set(null, value);
            } catch (NoSuchFieldException | IllegalAccessException e) {
                throw new RuntimeException(e);
            } finally {
                // Restore the original accessibility flag.
                if (field != null && isAccessible != null) {
                    field.setAccessible(isAccessible);
                }
            }
        }
    }
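To see what the socket timeout actually controls, here is a small standalone sketch that uses only java.net (no Selenium, no HttpClient). It connects to a local server that never sends anything, so the read blocks until the timeout expires. `SoTimeoutDemo` is a hypothetical name for illustration; the only assumption is that loopback sockets are available on your machine.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Hypothetical demo: shows a blocking read giving up once SO_TIMEOUT expires. */
final class SoTimeoutDemo {

    private SoTimeoutDemo() {
    }

    /** Returns true if reading from a silent peer ended with a read timeout. */
    static boolean readTimesOut(int soTimeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) {
            // Without this call the timeout is 0, which means "wait forever".
            client.setSoTimeout(soTimeoutMillis);
            client.getInputStream().read(); // blocks: the accepted side never writes
            return false;
        } catch (SocketTimeoutException e) {
            return true; // same exception type as in the "Read timed out" stack trace
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling `SoTimeoutDemo.readTimesOut(200)` should come back after roughly 200 ms. With a 3 hour timeout the same read would simply sit there, which is the hang described above.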