Intermittent problems sending order confirmation mails

kdxperbol · October 28, 2021, 10:42am

We have intermittent problems sending order confirmation mails. Sometimes they are sent out, sometimes they aren’t. There seem to be two root causes:

The mail service can’t get the HTML for the mail because the web request fails.
The scheduler doesn’t retry the send even though it says it will.

Here is the actual error:

Execution of background processor Litium.Application.Scheduler.BackgroundJobWorker, Litium.Application failed, executing again in 5 seconds.

System.Net.WebException: Unable to connect to the remote server ---> System.Net.Sockets.SocketException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond [load balancer external IP]:443
at System.Net.Sockets.Socket.DoConnect(EndPoint endPointSnapshot, SocketAddress socketAddress)
at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)
--- End of inner exception stack trace ---
at System.Net.HttpWebRequest.GetResponse()
at Litium.Accelerator.Services.MailServiceImpl.MailServiceProcessor.GetWebPageContent(…

Only one error appears per failed order confirmation mail send, and only this error.

While beginning to troubleshoot this, I have some questions:

Is it correct/normal that it should use the external load balancer IP to fetch the mail body? If not, where would I configure this?
Does the scheduler run on all servers in a cluster? I can’t find any references to the scheduler in configuration.
Why does the scheduler not retry the send even though it says it will in “5 seconds”? Is this something that needs to be configured?
Any other pointers on how to troubleshoot this?

Litium version: 7.6.1 + accelerator @ Litium Cloud production environment

kdxperbol · October 28, 2021, 10:56am

Connecting to the external load balancer IP from the cluster servers does not work, at least not over HTTP/HTTPS. It also seems that one of the servers in the cluster did not have a hosts file entry mapping the external site address to 127.0.0.1, while the others did. So the problem was most likely that this one server tried to fetch the HTML externally and failed.

So it seems that, unless the “server address” can be easily configured (my question 1 above), all cluster nodes must resolve the external site address to an internal address (either a cluster node or localhost).
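
For anyone who wants to verify this on each node, here is a minimal Python sketch of the check. It is not part of Litium; the function names are made up for illustration. It asks the local resolver (which honours the hosts file) which addresses the site hostname maps to, then tries a plain TCP connect to port 443 on each one, which is roughly the step that failed in the stack trace above.

```python
import socket


def resolve_addresses(hostname, port=443):
    """Return the sorted, de-duplicated IPs the local resolver gives for hostname.

    The system resolver consults the hosts file, so a node without the
    local mapping will return the external address here.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def can_connect(address, port=443, timeout=5.0):
    """Return True if a TCP connection to address:port succeeds within timeout."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_site(hostname, port=443, timeout=5.0):
    """Map every address the hostname resolves to onto its reachability."""
    return {
        address: can_connect(address, port, timeout)
        for address in resolve_addresses(hostname, port)
    }
```

Running `check_site("shop.example.com")` on every cluster node (the hostname is a placeholder for the external site address) should show an internal or loopback address that is reachable. A node that reports the load balancer's external IP as unreachable is a likely candidate for the failing sends.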

Still interested in questions 2 and 3 above even though the issue may have been resolved.

patric.forsgard · October 28, 2021, 11:11am

For questions 2 and 3, here are the answers:

Yes, the scheduler runs on all servers in the server farm and distributes the work between them; only one server processes any given scheduled job.
The message is misleading. It is the BackgroundJobWorker that will execute again after the 5-second delay, and it then tries to find the next scheduled message that has not already been processed. A message that has already been processed, in this case the sending of the mail, has been marked as failed and will not be processed again.
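
To make the distinction concrete, here is a rough Python model of the behaviour described above. It is only an illustration under stated assumptions and not Litium's actual implementation: `Job`, `Status` and `worker_pass` are hypothetical names, and the real worker is .NET code. The point it shows is that the worker runs again after a failure, while the job that failed is marked as failed and skipped from then on.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Status(Enum):
    PENDING = "pending"
    DONE = "done"
    FAILED = "failed"


@dataclass
class Job:
    name: str
    action: Callable[[], None]
    status: Status = Status.PENDING


def worker_pass(jobs: List[Job], log: List[str]) -> bool:
    """One wake-up of the (hypothetical) worker.

    Processes pending jobs in order. If a job raises, it is marked FAILED,
    the pass is aborted, and False is returned so the caller can schedule
    the next pass (for example after 5 seconds).
    """
    for job in jobs:
        if job.status is not Status.PENDING:
            continue  # already processed (done or failed): never picked up again
        try:
            job.action()
        except Exception as exc:
            job.status = Status.FAILED
            log.append(f"{job.name} failed ({exc}); worker executing again later")
            return False
        job.status = Status.DONE
    return True
```

In this model a second call to `worker_pass` after a failure moves on to the next pending job and does not touch the failed one, which matches seeing exactly one error per failed order confirmation mail.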

system · November 25, 2021, 11:12am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
