Opened 5 years ago

Closed 3 months ago

#119 closed defect (worksforme)

Pull relays do not reconnect, recover on 2.0.36

Reported by: mark@…
Owned by: mark@…
Priority: major
Component: Professional Caster
Version:
Keywords:
Cc:

Description

We have run into an issue with relays reconnecting to sources that go away and then come back. Our relevant configuration is:

relay pull -i u:p -m /RTCM3EPH products.igs-ip.net:2101/RTCM3EPH

max_clients 10000
max_clients_per_source 1000
max_sources 40
max_admins 2
throttle 0

max_ip_connections 1000

I'm going through the code trying to understand how read timeouts are applied to connections like this. Is there a configuration option we're missing that could help us recover quickly?

We had a recent outage where our BKG relay did not recover after a relay source went down for 15 minutes - the BKG relay stayed down for 2 hours, while our other caster (a SNIP) recovered the stream after 15 minutes.

Any advice on debugging this would be much appreciated. I'm thinking about adding a setsockopt call with SOL_TCP and TCP_USER_TIMEOUT on the sockets to improve timeouts.
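To make the idea concrete, here is a minimal sketch of the kind of socket options I have in mind. It is written in Python only for illustration and is not taken from the caster code; the function name `apply_relay_timeouts` and all timeout values are hypothetical. One caveat as I understand tcp(7): TCP_USER_TIMEOUT on its own only bounds how long transmitted data may stay unacknowledged, so for a pull connection that mostly reads it probably needs to be combined with SO_KEEPALIVE (whose failure handling it then overrides) or with an application-level read timeout.

```python
import socket


def apply_relay_timeouts(sock, user_timeout_ms=30000, idle_s=30,
                         interval_s=10, probes=3, read_timeout_s=60.0):
    """Configure a pull-relay socket so a vanished source is noticed.

    All default values are illustrative assumptions, not caster settings.
    Returns the names of the options that were actually applied.
    """
    applied = []

    # Ask the kernel to probe an otherwise idle connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied.append("SO_KEEPALIVE")

    # These options are platform dependent (TCP_USER_TIMEOUT is Linux-only),
    # so each one is looked up by name and skipped if unavailable or rejected.
    tcp_options = (
        ("TCP_KEEPIDLE", idle_s),        # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),   # seconds between probes
        ("TCP_KEEPCNT", probes),         # unanswered probes before giving up
        ("TCP_USER_TIMEOUT", user_timeout_ms),  # milliseconds
    )
    for name, value in tcp_options:
        option = getattr(socket, name, None)
        if option is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            continue
        applied.append(name)

    # Application-level guard: recv() raises socket.timeout if the source
    # sends nothing for read_timeout_s seconds, so the caller can reconnect.
    sock.settimeout(read_timeout_s)
    return applied
```

In C the same option names would be passed to setsockopt with SOL_SOCKET and SOL_TCP (IPPROTO_TCP). The read timeout is arguably the more direct fix for a stream that silently stops, since it does not depend on the kernel noticing that the peer is gone.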

Attachments (0)

Change History (3)

comment:1 by stuerze, 4 months ago

Status: new → assigned

comment:2 by stoecker, 4 months ago

Owner: changed from stoecker to mark@…
Status: assigned → needinfo

Somehow that got overlooked. Is this still an issue? Can you reproduce it, or describe what is required for it to happen? On the instances here it works as expected.

comment:3 by stoecker, 3 months ago

Resolution: worksforme
Status: needinfo → closed
