Design TinyUrl
- counter计数器
- pros: 简单
- cons:
- 允许范围为int的范围。
- url不一定更短。
- easy to predict
- 在计数器的基础上,引入字符编码。即根据counter来决定用什么字符。每个62个counter多一个字符
- cons:
- integer overflow
- 不一定更短
- easy to predict
- cons:
- 利用
hashCode()方法,- 数量仍然受到int的限制,因为hashCode是integer calculation
- length仍然不确定
- hashCode不能保证唯一性,可能存在hash collision,同时随着数量的增大,collision的概率也越来越大了。并且在int的limit到来之前就有可能发生collision。
- random number直接作为"counter"
- 仍然收到int限制
- 长度不定
- 随着数量的增大,collision的概率也更大了
- random number + fixed length string
- 允许数量变多62^6
- 无法预测,可以灵活的增加
- 还可以用空间换时间,增加一个HashMap更快
https://segmentfault.com/a/1190000006140476
How many unique identifiers possible? Will you run out of unique URLs? (10+26+26)^6
Should the identifier be increment or not? Which is easier to design? Pros and cons? Yes
- pros: more urls. hard to be decoded
- cons: not tiny anymore, could be very long
Mapping an identifier to an URL and its reversal - Does this problem ring a bell to you?
- LC 535
How do you store the URLs? Does a simple flat file database work?
What is the bottleneck of the system? Is it read-heavy or write-heavy?
Estimate the maximum number of URLs a single machine can store.
Estimate the maximum number of queries per second (QPS) for decoding a shortened URL in a single machine.
How would you scale the service? For example, a viral link which is shared in social media could result in a peak QPS at a moment's notice.
How could you handle redundancy? i,e, if a server is down, how could you ensure the service is still operational?
Keep URLs forever or prune, pros/cons? How we do pruning? (Contributed by @alex_svetkin)
What API would you provide to a third-party developer? (Contributed by @alex_svetkin)
If you can enable caching, what would you cache and what's the expiry time? (Contributed by @Humandroid)