On unix-like systems they're normally implemented in terms of a pthreads semaphore/mutex pair.
The entire operation should take place in user space so you don't pay to switch to kernel mode, but you will be working with two synchronisation primitives under the covers. This will mean that memory fences will be issued, so there is always some price to pay.
To cut a long story short, calling notify_one when you should, i.e. after changing the state of the condition and releasing the lock, it's a reasonably cheap operation. Calling notify_one in a tight loop for no good reason is probably not going to be a good idea.
What happens if a thread calls notify_one when there is no waiting thread?
Take a mutex, check whether there are threads waiting, release the mutex. end.
Does this call have any performance impact (CPU-cycles, power etc)?
Yes of course, it consumes a few cycles and requires that the CPU is operating. Doing it once in a while won't hurt. Doing it continuously in a tight loop will consume power.
I guess my question for you is, "what's the use case"? If you're adding a million items a second to a producer/consumer queue then you're going to spend a lot of time and energy notifying nonexistent consumers. If you're adding 10 a second, time spent in notify_one probably won't even show up on any performance trace.
These questions are extremely implementation-specific. Just saying you're on Windows is not enough; each standard library may have different implementations, and a debug version could have a different implementation than a release version.
The semantic effect of notify_one when no thread is waiting is a no-op. In implementation terms, at the very least the thread has to check an atomic variable to determine if any threads are waiting. So there is a bit of overhead.
The Microsoft standard library's condition_variable is implemented in terms of the concurrency runtime's condition variable which, starting from Windows Vista, is implemented in terms of the WinAPI RTL_CONDITION_VARIABLE. The implementation of that is not accessible. However, there's a reasonable chance that its implementation is based on this Microsoft research paper: