Speed Optimization
Want to run hundreds of browsers at once? You're in the right place. Surfsky handles up to 1,000 concurrent browsers (depending on your plan) without breaking a sweat. Our Kubernetes infrastructure and Chromium optimizations mean you get speed AND stability at scale.
Every browser runs in complete isolation with its own fingerprint. Anti-bot systems can't link them together unless your automation patterns give you away. This design gives you:
- True isolation - Each browser is completely independent
- Efficient resources - Optimized RAM and CPU usage at the Chromium level
- Rock-solid stability - Kubernetes keeps everything running smoothly
- Easy scaling - Go from 10 browsers to 100 with a config change
Quick Example
Need specific code examples for your use case? Contact us and we'll help you get started.
Here's basic parallel processing in action:
from dataclasses import dataclass

from core.executor import TaskExecutor
from core.config import ExecutorConfig
from core.pipeline.types import BaseTask
from core.pipeline.result import Result


@dataclass
class ScrapedData:
    title: str
    status: int


class WebScraper(BaseTask):
    async def main(self, browser, url) -> Result[ScrapedData]:
        try:
            async with browser.managed_page() as page:
                response = await page.goto(url)
                title = await page.title()
                return Result.success(ScrapedData(
                    title=title,
                    status=response.status
                ))
        except Exception as e:
            return Result.failure(f"Error: {str(e)}")


async def main():
    # Set up parallel execution
    config = ExecutorConfig(
        browser_count=10,          # Run 10 browsers in parallel
        max_browser_tasks=5,       # Recycle browser after 5 tasks
        max_task_attempts=3,       # Retry failed tasks up to 3 times
        fingerprint={"os": "mac"}  # Browser fingerprint settings
    )

    # Create scraper and executor
    scraper = WebScraper()
    executor = TaskExecutor(config)

    # URLs to process
    urls = [
        "https://example1.com",
        "https://example2.com",
        # ... hundreds more
    ]

    # Run everything in parallel
    results, metrics = await executor.execute(urls, scraper)
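To actually run it, hand main() to asyncio.run:

import asyncio

if __name__ == "__main__":
    asyncio.run(main())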
Optimization Tips
Browser Pool Management
config = ExecutorConfig(
    browser_count=10,        # Start with 10, increase based on your limits
    max_browser_tasks=5,     # Fresh browser every 5 tasks prevents memory issues
    max_browser_attempts=3,  # Retry browser crashes
    task_timeout=30,         # Kill stuck tasks after 30 seconds
    attempt_delay=2          # Wait 2 seconds between retries
)
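Not sure where to set browser_count? A simple rule of thumb is to stay a bit below your plan's concurrency cap (PLAN_LIMIT below is a placeholder for your own plan, not a documented value):

PLAN_LIMIT = 50  # placeholder: your plan's concurrent-browser cap

config = ExecutorConfig(
    browser_count=max(1, int(PLAN_LIMIT * 0.8)),  # keep ~20% headroom to avoid 429s
    max_browser_tasks=5,
)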
Smart Proxy Rotation
def get_proxies(count: int) -> list[str]:
    # Cycle through countries so the pool stays balanced
    # (credentials and host below are placeholders for your own proxy)
    countries = ["US", "UK", "DE", "FR"]
    return [
        f"socks5://user:pass@your-proxy-host:1080?country={countries[i % len(countries)]}"
        for i in range(count)
    ]

config = ExecutorConfig(
    browser_count=10,
    proxies=get_proxies(20),  # Keep the proxy pool bigger than the browser count
)
Resilient Error Handling
import asyncio

class ResilientScraper(BaseTask):
    async def main(self, browser, url) -> Result:
        try:
            async with browser.managed_page() as page:
                for attempt in range(3):
                    try:
                        await page.goto(url, timeout=10000)
                        # extract_data is your own parsing method
                        return Result.success(await self.extract_data(page))
                    except Exception:
                        if attempt == 2:
                            raise
                        await asyncio.sleep(2)
        except Exception as e:
            return Result.failure(str(e))
Watch Your Limits
Rate Limiting
We track your usage with headers:
x-ratelimit-limit: 200 # Max requests/minute
x-ratelimit-limit-hour: 3000 # Max requests/hour
x-ratelimit-remaining: 198 # Requests left this minute
x-ratelimit-remaining-hour: 2998 # Requests left this hour
Hit your limit? You'll get a 429 error. Back off and retry.
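One way to handle that, as a minimal sketch (the URL and auth header are placeholders, and the requests package is assumed):

import time
import requests

def get_with_backoff(url: str, headers: dict, max_retries: int = 5) -> requests.Response:
    # Retry on 429 with exponential backoff, logging what's left in the window
    delay = 1.0
    for _ in range(max_retries):
        resp = requests.get(url, headers=headers)
        if resp.status_code != 429:
            return resp
        print("Rate limited; remaining this minute:",
              resp.headers.get("x-ratelimit-remaining"))
        time.sleep(delay)
        delay *= 2
    return resp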
Browser Limits
- Check active browsers: GET /active or /profiles
- Exceeding your plan limit triggers 429 errors
- Got an error? Include the trace ID when contacting support:
x-cloud-tracing-uuid: fea367e7cfc840818508754b5f1c1f51
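A quick sketch of that check (the base URL and token are placeholders, not documented values):

import requests

API_BASE = "https://api.example.com"           # placeholder: your API base URL
HEADERS = {"Authorization": "Bearer <token>"}  # placeholder: your API key

resp = requests.get(f"{API_BASE}/active", headers=HEADERS)
if resp.status_code >= 400:
    # Grab the trace ID so support can find your request
    print("Status:", resp.status_code)
    print("Trace ID:", resp.headers.get("x-cloud-tracing-uuid"))
else:
    print("Active browsers:", resp.json())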
Closing Browsers
Browsers close automatically after inactive_kill_timeout seconds of inactivity. You can also close them:
- browser.close() in Playwright/Puppeteer
- Stop endpoints via API
Always close browsers when done - it keeps you within limits and saves resources.
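With Playwright, the pattern looks roughly like this (a sketch; connecting over CDP and the ws_endpoint variable are assumptions about how you attach to a cloud browser session):

from playwright.async_api import async_playwright

async def scrape_once(ws_endpoint: str):
    # ws_endpoint: the WebSocket/CDP URL of your cloud browser session (assumption)
    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp(ws_endpoint)
        try:
            page = await browser.new_page()
            await page.goto("https://example.com")
        finally:
            await browser.close()  # release the browser so it doesn't count against your limit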
Performance Metrics
Track what matters:
print(f"Success rate: {(metrics.completed/metrics.total)*100:.1f}%")
print(f"Avg time: {metrics.avg_time:.2f}s")
print(f"Failures: {metrics.failed}")
Common Mistakes
Over-parallelization
- Going beyond your plan limit causes 429 errors
- Target sites might also rate limit you
Poor error handling
- Network issues happen - plan for them
- Always add retry logic
Resource leaks
- Browsers consume memory over time
- Recycle them regularly with max_browser_tasks
The Latency Reality Check
Local vs Cloud
Your local dev environment has near-zero latency. The cloud doesn't. Here's what that means:
Local Development
- Command latency: 1-5ms
- No network overhead
- Instant responses
Cloud Reality
- Command latency: 50-100ms (network RTT)
- With proxy: +100-300ms
- Multiple commands multiply the delay
Real Example: Login Flow
- 10 commands locally: ~50ms total
- Same commands in cloud: ~1000ms
- With proxies: ~3000ms
Each "simple" action might be several CDP commands. A click involves finding the element, scrolling to it, and clicking - that's 3 round trips.
Speed Strategies
1. Go Parallel
# Slow - sequential
for element in elements:
    await get_text(element)

# Fast - parallel
await asyncio.gather(
    *(get_text(element) for element in elements)
)
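When you go parallel, consider capping concurrency so you don't overwhelm the target site. A sketch using a semaphore (get_text and elements are the placeholders from the snippet above):

import asyncio

sem = asyncio.Semaphore(10)  # at most 10 requests in flight at once

async def get_text_limited(element):
    async with sem:
        return await get_text(element)

results = await asyncio.gather(
    *(get_text_limited(element) for element in elements)
)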
2. Pick the Right Tool
- WebSocket frameworks (Playwright, Puppeteer): Faster
- HTTP frameworks (Selenium): Slower
- Low-level CDP: Fastest but harder
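For reference, a single raw CDP call over WebSocket looks like this (a sketch assuming the websockets package and that you have the browser's DevTools WebSocket URL):

import asyncio
import json
import websockets  # assumption: pip install websockets

async def cdp_call(ws_url: str):
    # One round trip: ask the browser for its open targets
    async with websockets.connect(ws_url) as ws:
        await ws.send(json.dumps({"id": 1, "method": "Target.getTargets"}))
        reply = json.loads(await ws.recv())
        print(reply["result"]["targetInfos"])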
3. Framework Tricks
# Playwright example
# Slow - simulates typing
await page.keyboard.type("text")
# Fast - sets value directly
await page.locator("input").fill("text")
4. Location Matters
- Browsers close to your servers = 8-9x faster
- Consider multi-region deployment
- Ask us about colocating near our infrastructure
Design for parallel processing from day one. The performance difference between sequential and parallel can be 10x or more at scale.
Need More Speed?
Running near our infrastructure dramatically improves performance. Contact us to discuss low-latency deployment options.