How do I generate all possible words of a regular expression with the following rules and syntax:
- the user inputs the alphabet;
- the user inputs the expression;
- any character that's not ()*+ or space can be part of the alphabet;
- the + character chooses between the character or sequence on its left or the one on its right;
- the * character allows one or more repetitions of the character or sequence on its left;
- two alphabetic characters in sequence will be concatenated;
- parentheses may change precedence.
I'm getting alphabet and expression as strings from user, and casting it to python lists. Then I proceed some simple validation tests on both, based on the rules above.
After that, currently, my algorithm already generates correctly the words of expressions WITHOUT parentheses. And here's my problem: I haven't found yet a way to manage parentheses properly to correctly generate the words. As I see, I have two options:
1) find a way to calculate the possible words directly from the original expression, or
2) somehow eliminate all the parentheses and have my current algorithm to solve it.
I wonder if I can get the first option done using some kind of recursive function, although I still don't know how. Any thoughts?
Here's the code so far (sorry for comments in portuguese, brazilian here):
alphabetInput = input(
"Informe os caracteres do alfabeto (OBS.: os símbolos '(', ')', '+' e '*' são caracteres reservados):\n")
alphabetInput = alphabetInput.strip() # remove os espaços em branco da string
alphabetInput = list(alphabetInput) # transforma a string numa lista
# remove as duplicidades do alfabeto
alphabet = 
for c in alphabetInput:
if c not in alphabet:
for c in alphabet:
if c in "()*+":
print("Alfabeto inválido (OBS.: os símbolos '(', ')', '+' e '*' são caracteres reservad