I have a small email newsletter system and when a new email address is added (through mass import) it defaults to "subscribed int(11) DEFAULT '1'`".
I then have the below query which looks for the email address and updates any that are already in the table but are un-subscribed :
UPDATE emailData SET subscribed = '0'
WHERE subscribed = '1' and emailAddress IN
SELECT emailAddress FROM
(SELECT emailAddress FROM emailData WHERE subscribed = '0' GROUP BY emailAddress) AS tmptable
With around 5000 duplicates it takes around 15 seconds to execute (VM Server) and I wanted to know if there was a better / faster way to do this?
Yes, there's a better way - don't have duplicates. Create a unique (primary?) index on email address and set your import operation to ignore duplicates (otherwise it will still do the import but generate lots of errors).
1. Need to change query as per below-
UPDATE emailData AS ed JOIN emailData AS ed1
ON ed.emailAddress = ed1.emailAddress
SET ed.subscribed = '0'
WHERE ed.subscribed = '1' AND ed1.subscribed = '0';
2. emailAddress field data length should be short as much possible may be varchar(50) or varchar(60) if possible.
3. make a composit index on emailAddress and subscribed fields.
Note: If emailAddress field is text or like varchar(250) etc and you can't short it then make partial index like first 50 characters only which will be enough and fast.
Asked in February 2016Viewed 3,692 timesVoted 5Answered 2 times